An introduction to Geoslurp =========================== **Geoslurp is a pure-python module which allows downloading/updating/querying of different datasets in a spatially aware database (PostgreSQL with PostGIS).** The idea behind Geoslurp is to centrally manage datasets, which has the following advantages: - It provides and encourages a central point to access to various datasets - Established database functionatility can be exploited by allowing (spatial) queries of the datasets, *joins* (querying a combination of datasets) - Sharing is encouraged, users share data by default, making it possible for other users to use the registered data and avoiding copies of large datasets - It allows for consistent versioning of datasets - The use of an 'off-the-shelf' database provides a standard and mature interface reachable from various programming languages. - Large datafiles don't necessarily need to be in the database. The metainformation, required for the queries, will be stored together with a link to where the data can be found (e.g. a local file path or an online url). This allows relatively light-weight databases to be set up. - Provide ways to standardize downloading/updating/registering of datasets Requirements ------------ To use Geoslurp one must essentially :ref:`install ` the Geoslurp Python package, and have a running PostgreSQL database with PostGIS. Geoslurp is currently not yet in PyPI, but this is foreseen in the future once a relatively decent first version is developed. What the future holds ----------------------- At the time of writing, the geoslurp tools are still under development. But taking a look in the future, one may already philosophize about possible features: - Manage a geoslurp instance by means of a website-frontend. Currently, the main way to interact with geoslurp is to use the :ref:`geoslurper command line script `. This may be too cumbersome to learn for occasional users. - Hide the database access behind a REST-API. This may be more secure when public instances will be set up and will facilitate the interaction with other webservices. - Encourage reproducibility by registering scientific processing pipelines.