Basics of Reading and Writing from Geobase in GeoParquet
Introduction
This guide will walk you through the process of loading GeoParquet files into a GeoPandas DataFrame and saving them to a “Geobase” database. Additionally, it will cover how to read data from “Geobase” and save it back to a GeoParquet file.
Prerequisites
Ensure you have the following libraries installed:
geopandas
sqlalchemy
geoalchemy2
You can install any missing libraries using pip:
pip install geopandas sqlalchemy geoalchemy2
Cloning the GeoParquet Repository
First, clone the GeoParquet repository to obtain the example parquet file.
!git clone https://github.com/opengeospatial/geoparquet.git
Writing GeoParquet to Geobase
To load a GeoParquet file into a GeoPandas object, use the geopandas.read_parquet
function. Then, save the GeoPandas object to Geobase using the geopandas.to_postgis
function.
Define the Path to Your GeoParquet File
geoparquet_file = '/content/geoparquet/examples/example.parquet'
Read the GeoParquet File into a GeoDataFrame
import geopandas as gpd
gdf = gpd.read_parquet(geoparquet_file)
gdf.head()
Define the Connection String for Your geobase Database
from sqlalchemy import create_engine
db_connection_string = "postgresql://username:password@hostaddress.geobase.app:port/database"
engine = create_engine(db_connection_string)
Load the GeoDataFrame
into geobase
gdf.to_postgis(name="geoParquet_test", con=engine, if_exists='replace')
Reading from Geobase and Saving as GeoParquet
Data can be loaded from Geobase and saved as GeoParquet by following the steps below:
- Load data from Geobase as a
GeoDataFrame
- Save data as GeoParquet by calling the
to_parquet
method
Read Data from Geobase
sql = """
SELECT * FROM parks
"""
df = gpd.read_postgis(sql, con=engine, geom_col="location")
Save Data as GeoParquet
df.to_parquet("example.parquet")
The file with a .parquet
extension will be saved in the desired directory.