August 17, 2022

How to Web Scrape Using Python, Snscrape & HarperDB

Summary of What to Expect
  1. Create a HarperDB Account: Sign up on https://harperdb.io/ or sign in at https://studio.harperdb.io/.
  2. Create a HarperDB Cloud Instance: Follow instructions to create a cloud instance for storing and fetching scraped data.
  3. Configure HarperDB Schema and Table: Create a schema (e.g., "data_scraping") and a table (e.g., "tweets") with a hash attribute.
  4. Install Required Packages: Install the HarperDB Python SDK (pip install harperdb) and snscrape (pip install snscrape).
  5. Import Packages: Import necessary packages for Twitter scraping and HarperDB.
  6. Connect to HarperDB Cloud Instance: Connect to the cloud instance using the instance URL, username, and password.
  7. Create Function to Record Scraped Tweets: Define a function to insert scraped data into the "tweets" table.
  8. Scrape Tweets Using snscrape: Use snscrape to scrape tweets based on a search query and save them to the table (a combined sketch of steps 5-8 follows this list).
  9. View the Tweets Table: Access your HarperDB cloud instance to view the scraped data in the "tweets" table.
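
Steps 5 through 8 fit into one short script. The sketch below is a minimal example rather than the article's exact code: the instance URL, credentials, and search query are placeholders, the field names stored per tweet are illustrative, and the tweet attributes used (content, date, user.username) are the commonly used snscrape fields, which can vary between snscrape versions.

    import harperdb
    import snscrape.modules.twitter as sntwitter

    # Step 6: connect to the HarperDB cloud instance.
    # The URL, username, and password are placeholders for your own instance.
    db = harperdb.HarperDB(
        url="https://your-instance.harperdbcloud.com",
        username="your_username",
        password="your_password",
    )

    # Step 7: insert one scraped tweet into the "tweets" table of the
    # "data_scraping" schema created in step 3.
    def record_tweet(tweet):
        db.insert("data_scraping", "tweets", [{
            "tweet_id": tweet.id,          # illustrative field name; the table's hash
                                           # attribute can also be auto-generated by HarperDB
            "date": str(tweet.date),
            "username": tweet.user.username,
            "content": tweet.content,      # attribute name can differ across snscrape versions
        }])

    # Step 8: scrape tweets matching a search query and save each one.
    query = "harperdb since:2022-01-01"    # example query; adjust as needed
    for i, tweet in enumerate(sntwitter.TwitterSearchScraper(query).get_items()):
        if i >= 100:                       # cap the number of tweets for this example
            break
        record_tweet(tweet)

After the loop finishes, the scraped rows appear in the "tweets" table, which you can browse in HarperDB Studio as described in step 9.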

Creating Custom Functions with HarperDB (Optional):

  1. Enable Custom Functions: Enable Custom Functions in HarperDB Studio.
  2. Create a Project: Create a project with a specified name, generating necessary files.
  3. Define a Route: Create a route to fetch data from the "tweets" table using SQL.
  4. Access Data via API Endpoint: Send an API request to the defined route to retrieve the data (see the sketch after this list).
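
Once the route is defined, any HTTP client can read from it. The sketch below uses Python's requests library; the Custom Functions URL, project name, and route path are placeholders for the values HarperDB Studio shows for your own project.

    import requests

    # Placeholder Custom Functions URL, project name, and route path; substitute
    # the values displayed for your project in HarperDB Studio.
    url = "https://your-functions-url.harperdbcloud.com/my-project/tweets"

    response = requests.get(url)
    response.raise_for_status()

    # The route is assumed to return the rows of the "tweets" table as JSON.
    for tweet in response.json():
        print(tweet)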