This error is easy to miss by eye, but it causes problems when processing the XML. The first example I came across was at Iliad 5.719, where the proper noun Ἀθήνη is lemmatized as the following sequence of Unicode code points (in decimal): 787 7936 952 ... Here 787 (U+0313) is a combining comma above, and 7936 (U+1F00) is an alpha with a smooth breathing mark. So the breathing mark is in there twice: once as a combining character and once built into the precomposed ἀ. If you view the string on a screen, the result depends somewhat on the software rendering it. For example, in the terminal program I use, the combining comma lands almost on top of the breathing mark, so it looks like a slightly fatter breathing mark.
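Since the problem is invisible on screen, it can help to dump the code points directly. The following is a minimal sketch in Python using only the standard library. The helper name `describe` is my own, and the sample string contains only the three code points quoted above, not the full lemma.

```python
import unicodedata

def describe(s):
    """Return (decimal, hex, Unicode name) for each code point in s."""
    return [(ord(ch), f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
            for ch in s]

# The first three code points of the lemma, as quoted above: 787 7936 952
prefix = "\u0313\u1f00\u03b8"

for dec, hexcode, name in describe(prefix):
    print(dec, hexcode, name)
# 787 U+0313 COMBINING COMMA ABOVE
# 7936 U+1F00 GREEK SMALL LETTER ALPHA WITH PSILI
# 952 U+03B8 GREEK SMALL LETTER THETA
```

"Psili" is the Unicode name for the smooth breathing, so the second line of output confirms that the breathing is already built into the alpha.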
This seems to occur repeatedly, but not every time, for the following lemmas representing proper names (which appear in the XML in lowercase): ἀπόλλων, ἀλέξανδρος, ἀφροδίτη, ἀτρείδης, ἀθήνη, ἀχαιός, ἀνδρομάχη, ἰδομενεύς, ὠκεανός, ὀδυσσεύς, ἀσκληπιάδης, ἀντίλοχος, ἀχιλλεύς, ἀγχίσης, ἀλκίνοος, ἀρήτη, ὠγυγία.
Also: ἐνψύω at Iliad 8.382, and some other words that are not proper nouns: ἐννοσίγαιος, ἀμφίμαχος.
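For anyone who wants to work around this while processing the data, here is a rough sketch of one way to detect and drop the extra mark. It assumes the stray character is always a combining breathing (U+0313 or U+0314) at the very start of the lemma, as in the Iliad 5.719 example. I have not verified that every affected lemma follows that pattern, and the function names are hypothetical.

```python
import unicodedata

# Combining comma above (smooth) and combining reversed comma above (rough).
BREATHINGS = "\u0313\u0314"

def has_breathing(ch):
    """True if the character's canonical decomposition includes a breathing mark."""
    return any(c in BREATHINGS for c in unicodedata.normalize("NFD", ch))

def strip_redundant_breathing(lemma):
    """Drop leading combining breathings when the first letter already has one.

    Assumption: the redundant mark precedes the word, as in the example above.
    Lemmas that do not match this pattern are returned unchanged.
    """
    rest = lemma.lstrip(BREATHINGS)
    if rest == lemma or not rest:
        return lemma  # nothing stripped, or nothing left but marks
    # Compose first so that a decomposed "α + U+0313" is also recognized.
    first = unicodedata.normalize("NFC", rest)[0]
    return rest if has_breathing(first) else lemma

def has_redundant_breathing(lemma):
    return strip_redundant_breathing(lemma) != lemma

clean = "\u1f00\u03b8\u03ae\u03bd\u03b7"   # ἀθήνη, precomposed initial alpha
dirty = "\u0313" + clean                   # same lemma with the stray U+0313 in front

print(has_redundant_breathing(dirty))              # True
print(has_redundant_breathing(clean))              # False
print(strip_redundant_breathing(dirty) == clean)   # True
```

The check is deliberately conservative: it only removes a mark when the following letter already carries a breathing, so a leading combining character in front of a bare vowel is left alone for manual inspection.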