Basics of Reading and Writing from Geobase in GeoParquet

Introduction

This guide will walk you through the process of loading GeoParquet files into a GeoPandas DataFrame and saving them to a “Geobase” database. Additionally, it will cover how to read data from “Geobase” and save it back to a GeoParquet file.

Prerequisites

Ensure you have the following libraries installed:

  • geopandas
  • sqlalchemy
  • geoalchemy2

You can install any missing libraries using pip:

pip install geopandas sqlalchemy geoalchemy2

Cloning the GeoParquet Repository

First, clone the GeoParquet repository to obtain the example parquet file.

!git clone https://github.com/opengeospatial/geoparquet.git

Writing GeoParquet to Geobase

To load a GeoParquet file into a GeoPandas object, use the geopandas.read_parquet function. Then, save the GeoPandas object to Geobase using the geopandas.to_postgis function.

Define the Path to Your GeoParquet File

geoparquet_file = '/content/geoparquet/examples/example.parquet'

Read the GeoParquet File into a GeoDataFrame

import geopandas as gpd
 
gdf = gpd.read_parquet(geoparquet_file)
gdf.head()

Define the Connection String for Your geobase Database

from sqlalchemy import create_engine
 
db_connection_string = "postgresql://username:password@hostaddress.geobase.app:port/database"
engine = create_engine(db_connection_string)

Load the GeoDataFrame into geobase

gdf.to_postgis(name="geoParquet_test", con=engine, if_exists='replace')

geoparquet_test

Reading from Geobase and Saving as GeoParquet

Data can be loaded from Geobase and saved as GeoParquet by following the steps below:

  • Load data from Geobase as a GeoDataFrame
  • Save data as GeoParquet by calling the to_parquet method

Read Data from Geobase

sql = """
SELECT * FROM parks
"""
df = gpd.read_postgis(sql, con=engine, geom_col="location")

Save Data as GeoParquet

df.to_parquet("example.parquet")

The file with a .parquet extension will be saved in the desired directory.