Impresso API resources
Search
Search content items in the Impresso corpus.
impresso.search.find(term='Titanic', limit=10)
impresso.resources.search.SearchResource
Bases: Resource
Search content items in the Impresso database.
find(term=None, order_by=None, limit=None, offset=None, with_text_contents=False, title=None, front_page=None, entity_id=None, newspaper_id=None, date_range=None, language=None, mention=None, topic_id=None, collection_id=None, country=None, partner_id=None, text_reuse_cluster_id=None)
Search for content items in Impresso.
impresso.api_client.models.search_order_by.SearchOrderByLiteral = Literal['date', 'id', 'relevance', '-date', '-relevance', '-id']
module-attribute
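The values above come in pairs: a bare field name and the same name with a leading `-`. The following is a minimal sketch of splitting such a value into a field and a direction. It assumes that the `-` prefix means descending order, which this page does not state. `split_order_by` is a hypothetical helper, not part of the library.

```python
from typing import Tuple

# Values copied from SearchOrderByLiteral as listed on this page.
SEARCH_ORDER_BY = ("date", "id", "relevance", "-date", "-relevance", "-id")


def split_order_by(value: str) -> Tuple[str, bool]:
    """Return (field, descending) for an order_by value.

    Assumption: a leading '-' means descending order.
    """
    if value not in SEARCH_ORDER_BY:
        raise ValueError(f"unsupported order_by value: {value!r}")
    descending = value.startswith("-")
    return value.lstrip("-"), descending
```

Checking the value locally before calling `find` can surface a typo such as `"dates"` without a round trip to the API.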
impresso.resources.search.SearchDataContainer
Bases: DataContainer
Response of a search call.
df: DataFrame
property
Return the data as a pandas dataframe.
Entities
Search entities in the Impresso corpus.
impresso.entities.find(term="Douglas Adams")
impresso.resources.entities.EntitiesResource
Bases: Resource
Search entities in the Impresso database.
find(term=None, wikidata_id=None, entity_id=None, entity_type=None, order_by=None, resolve=False, limit=None, offset=None)
Search entities in Impresso.
get(id)
Get entity by ID.
impresso.resources.entities.EntityType = Literal['person', 'location']
module-attribute
impresso.api_client.models.find_entities_order_by.FindEntitiesOrderByLiteral = Literal['count', 'count-mentions', 'name', 'relevance', '-relevance', '-name', '-count', '-count-mentions']
module-attribute
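As an illustration of how `entity_type` constrains `find`, here is a small sketch that checks the value against the two literals listed above before delegating to `entities.find`. The wrapper name `find_entities` and the `client` argument (standing for the connected `impresso` object used in the examples on this page) are hypothetical.

```python
from typing import Any, Optional

# Values copied from EntityType as listed on this page.
ENTITY_TYPES = ("person", "location")


def find_entities(client: Any, term: str, entity_type: Optional[str] = None,
                  limit: int = 10) -> Any:
    """Call client.entities.find after checking entity_type locally."""
    if entity_type is not None and entity_type not in ENTITY_TYPES:
        raise ValueError(f"entity_type must be one of {ENTITY_TYPES}")
    # Only keyword arguments that appear in the find() signature are used.
    return client.entities.find(term=term, entity_type=entity_type, limit=limit)
```

With a connected client, `find_entities(impresso, "Douglas Adams", entity_type="person")` would forward the call to `impresso.entities.find`.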
Media sources
Search media sources available in the Impresso corpus.
impresso.media_sources.find(
    term="wort",
    order_by="lastIssue",
)
impresso.resources.media_sources.MediaSourcesResource
Bases: Resource
Search media sources in the Impresso database.
find(term=None, type=None, order_by=None, with_properties=False, limit=None, offset=None)
Search media sources in Impresso.
impresso.api_client.models.find_media_sources_order_by.FindMediaSourcesOrderByLiteral = Literal['countIssues', 'firstIssue', 'lastIssue', 'name', '-name', '-firstIssue', '-lastIssue', '-countIssues']
module-attribute
impresso.resources.media_sources.FindMediaSourcesContainer
Bases: DataContainer
Response of a search call.
df: DataFrame
property
Return the data as a pandas dataframe.
Content Items
Get a single content item by ID.
impresso.content_items.get("NZZ-1794-08-09-a-i0002")
Collections
Work with collections.
impresso.resources.collections.CollectionsResource
Bases: Resource
Work with collections.
add_items(collection_id, item_ids)
Add items to a collection by their IDs.
NOTE: Items are not added immediately. This operation may take up to a few minutes to complete and reflect in the collection.
find(term=None, order_by=None, limit=None, offset=None)
Search collections in Impresso.
get(id)
Get collection by ID.
items(collection_id, limit=None, offset=None)
Return all content items from a collection.
remove_items(collection_id, item_ids)
Remove items from a collection by their IDs.
NOTE: Items are not removed immediately. This operation may take up to a few minutes to complete and reflect in the collection.
impresso.api_client.models.find_collections_order_by.FindCollectionsOrderByLiteral = Literal['date', 'size', '-date', '-size']
module-attribute
impresso.resources.collections.FindCollectionsContainer
Bases: DataContainer
Response of a find call.
df: DataFrame
property
Return the data as a pandas dataframe.
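Because `add_items` and `remove_items` take effect only after a delay of up to a few minutes, code that depends on the updated collection may need to wait. One possible approach is a generic polling loop, sketched below. `wait_until` is a hypothetical helper, not part of the library, and the default timeout and interval are arbitrary assumptions.

```python
import time
from typing import Callable


def wait_until(check: Callable[[], bool], timeout: float = 300.0,
               interval: float = 10.0) -> bool:
    """Call check() repeatedly until it returns True or timeout elapses.

    Returns True if check() succeeded, False if the timeout was reached.
    """
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

The `check` callable could, for example, call `impresso.collections.items(collection_id)` and look for the expected IDs. How IDs are read from the returned container is not described on this page, so that part is left to the caller.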
Named entity recognition
The Python library includes named entity recognition methods that use the same NER model that was used to add entities to the Impresso database.
impresso.resources.tools.ToolsResource
Bases: Resource
Various helper tools.
nel(text)
Named Entity Linking
This method requires named entities to be enclosed in tags: [START]entity[END].
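To illustrate the tagging format, the sketch below wraps a character span of the input in `[START]` and `[END]` markers. `tag_entity` is a hypothetical helper, not part of the library.

```python
def tag_entity(text: str, start: int, end: int) -> str:
    """Wrap text[start:end] in [START]...[END] markers."""
    if not 0 <= start < end <= len(text):
        raise ValueError("start and end must delimit a non-empty span of text")
    return f"{text[:start]}[START]{text[start:end]}[END]{text[end:]}"


sentence = "Douglas Adams was born in Cambridge."
tagged = tag_entity(sentence, 0, len("Douglas Adams"))
# tagged == "[START]Douglas Adams[END] was born in Cambridge."
```

The tagged string is what `nel` expects as its `text` argument.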
ner(text)
Named Entity Recognition
This method is faster than ner_nel but does not provide any linking to external resources.
ner_nel(text)
Named Entity Recognition and Named Entity Linking
This method is slower than ner but provides linking to external resources.
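The trade-off between `ner` and `ner_nel` can be expressed as a small dispatch function. This is only a sketch: `recognise` and its `with_links` flag are hypothetical, and `tools` stands for a `ToolsResource` instance.

```python
from typing import Any


def recognise(tools: Any, text: str, with_links: bool = False) -> Any:
    """Use ner_nel when links are needed, the faster ner otherwise."""
    if with_links:
        return tools.ner_nel(text)
    return tools.ner(text)
```

Defaulting to `ner` keeps the common case fast. Callers opt in to the slower linking step only when they need external identifiers.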
impresso.resources.tools.NerContainer
Bases: DataContainer
Named entity recognition result container.
df: DataFrame
property
Return the data as a pandas dataframe.
limit: int
property
Page size.
offset: int
property
Page offset.
size: int
property
Current page size.
total: int
property
Total number of results.
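The `limit`, `offset`, `size` and `total` properties mirror the `limit` and `offset` arguments of the `find` methods. Assuming the containers returned by `find` expose the same properties (this page lists them only for `NerContainer`), paging through a result set might look like the sketch below. `iter_pages` and its `fetch` argument are hypothetical.

```python
from typing import Any, Callable, Iterator


def iter_pages(fetch: Callable[..., Any], page_size: int = 100) -> Iterator[Any]:
    """Yield result containers page by page.

    fetch is any callable that accepts limit= and offset= keyword arguments
    and returns an object with `size` and `total` attributes.
    """
    offset = 0
    while True:
        page = fetch(limit=page_size, offset=offset)
        if page.size == 0:
            break  # nothing left to read
        yield page
        offset += page.size
        if offset >= page.total:
            break  # the last page has been consumed
```

A call such as `iter_pages(lambda limit, offset: impresso.search.find(term="Titanic", limit=limit, offset=offset))` would then walk the full result set under the stated assumption.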
Text reuse
Two resources can be used to search text reuse clusters and passages.
impresso.resources.text_reuse.clusters.TextReuseClustersResource
Bases: Resource
Text reuse clusters resource.
impresso.resources.text_reuse.passages.TextReusePassagesResource
Bases: Resource
Text reuse passages resource.