IndexError when words parameter is too high with keywords.keywords() #83

@Achuttarsing

How to reproduce the error:

from summa import keywords

text = """Automatic summarization is the process of reducing a text document with a \
computer program in order to create a summary that retains the most important points \
of the original document. As the problem of information overload has grown, and as \
the quantity of data has increased, so has interest in automatic summarization. \
Technologies that can make a coherent summary take into account variables such as \
length, writing style and syntax. An example of the use of summarization technology \
is search engines such as Google. Document summarization is another."""

keywords.keywords(text, words=30)

This produces:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-56-e1afaa84dab3> in <module>()
      1 text = """Automatic summarization is the process of reducing a text document with a computer program in order to create a summary that retains the most important points of the original document. As the problem of information overload has grown, and as the quantity of data has increased, so has interest in automatic summarization. Technologies that can make a coherent summary take into account variables such as length, writing style and syntax. An example of the use of summarization technology is search engines such as Google. Document summarization is another."""
      2 
----> 3 keywords.keywords(text, words=30)

2 frames
/usr/local/lib/python3.7/dist-packages/summa/keywords.py in <listcomp>(.0)
    101     # reduced by the provided ratio, else, the ratio is ignored.
    102     length = len(lemmas) * ratio if words is None else words
--> 103     return [(scores[lemmas[i]], lemmas[i],) for i in range(int(length))]
    104 
    105 

IndexError: list index out of range
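
From the traceback, line 103 iterates over `range(int(length))`, and `length` is taken directly from `words` when it is given, so any `words` value larger than `len(lemmas)` indexes past the end of the list. Below is a minimal sketch of one possible guard, clamping the length to the number of available lemmas. The function name `top_keywords` and the sample `scores`/`lemmas` data are hypothetical stand-ins for illustration, not summa's actual internals beyond the two lines shown in the traceback.

```python
def top_keywords(scores, lemmas, ratio=0.2, words=None):
    """Hypothetical stand-in for the failing code, with the length clamped."""
    # Same rule as in the traceback: use `words` if given, else apply the ratio.
    length = len(lemmas) * ratio if words is None else words
    # Assumed fix: never request more entries than there are lemmas.
    length = min(int(length), len(lemmas))
    return [(scores[lemmas[i]], lemmas[i]) for i in range(length)]


# Made-up data for illustration only; real scores come from the library.
scores = {"summarization": 0.4, "document": 0.3, "summary": 0.2}
lemmas = ["summarization", "document", "summary"]

# Asking for 30 keywords when only 3 exist no longer raises IndexError.
result = top_keywords(scores, lemmas, words=30)
print(result)
```

With this clamp, a too-large `words` value simply returns every available keyword. Whether the library should instead raise a clearer error is a separate design choice.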
