I came across some code today that used a set to keep track of previously seen values while iterating over a sequence, keeping just the not-seen-before items. A brute force kind of thing:
seen = set() unique =  for item in sequence: if not item in seen: unique.append(item) seen.add(item)
I remembered that I had come up with a simple form of this a while ago, using a list comprehension to do this in a single expression. I dug up the code, and wrapped it up in a nice little method. This version accepts any sequence or generator, and makes an effort to return a value of the same type as the input sequence:
def unique(seq): """Function to keep only the unique values supplied in a given sequence, preserving original order.""" # determine what type of return sequence to construct if isinstance(seq, (list,tuple)): returnType = type(seq) elif isinstance(seq, basestring): returnType = type(seq)('').join else: # - generators and their ilk should just return a list returnType = list try: seen = set() return returnType(item for item in seq if not (item in seen or seen.add(item))) except TypeError: # sequence items are not of a hashable type, can't use a set for uniqueness seen =  return returnType(item for item in seq if not (item in seen or seen.append(item)))
My first pass at this tried to compare the benefit of using a list vs. a set for seen – it turns out, both versions are useful, in case the items in the incoming sequence aren’t hashable, in which case the only option for seen is a list.