The Scraper Application is a Spring Boot-based project designed to automate the process of scraping business details from Google Maps, extracting contact emails from business websites, and sending promotional emails to these businesses. It utilizes the Google Places API for data extraction and processes the information to manage and execute email communications. The email body content is dynamically generated by ChatGPT to ensure engagement and to reduce the risk of emails being flagged as spam. The application is designed to run as a scheduled task, allowing for regular updates and communications with newly scraped businesses.
- Automated Data Scraping: Retrieves business information from Google Maps based on specified locations and keywords.
- Email Extraction: Scrapes emails from the websites of the scraped businesses.
- Email Communication: Sends promotional emails to businesses using predefined email templates and AI-generated content from ChatGPT.
- Status Management: Tracks and updates the status of businesses throughout the scraping and email-sending processes.
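The status tracking mentioned in the last feature could be modeled as a small state machine. The following is a minimal sketch, not the project's actual code: the `BusinessStatus` values, the `TrackedBusiness` class, and the allowed transitions are all hypothetical names chosen for illustration.

```java
/** Hypothetical lifecycle states for a scraped business (names are assumptions). */
enum BusinessStatus {
    SCRAPED, EMAIL_FOUND, EMAIL_SENT, FAILED;

    /** Returns true if moving from this status to {@code next} is allowed in this sketch. */
    boolean canMoveTo(BusinessStatus next) {
        return switch (this) {
            case SCRAPED -> next == EMAIL_FOUND || next == FAILED;
            case EMAIL_FOUND -> next == EMAIL_SENT || next == FAILED;
            case EMAIL_SENT, FAILED -> false; // terminal states in this sketch
        };
    }
}

/** Hypothetical stand-in for a persisted business record; holds only a name and a status. */
final class TrackedBusiness {
    private final String name;
    private BusinessStatus status = BusinessStatus.SCRAPED;

    TrackedBusiness(String name) {
        this.name = name;
    }

    String name() {
        return name;
    }

    BusinessStatus status() {
        return status;
    }

    /** Moves to the next status, rejecting transitions the sketch does not allow. */
    void advance(BusinessStatus next) {
        if (!status.canMoveTo(next)) {
            throw new IllegalStateException(name + ": cannot move from " + status + " to " + next);
        }
        status = next;
    }

    public static void main(String[] args) {
        TrackedBusiness business = new TrackedBusiness("Acme Bakery");
        business.advance(BusinessStatus.EMAIL_FOUND);
        business.advance(BusinessStatus.EMAIL_SENT);
        System.out.println(business.name() + " -> " + business.status());
    }
}
```

Keeping the transition rules in one place makes it harder for a processor to skip a stage, for example sending an email to a business whose address was never extracted.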
- Data Scraping: The ScrapedBusinessProcessor retrieves unprocessed locations and fetches business details using the Google Maps API. These details are stored in the database as ScrapedBusiness entities.
- Email Extraction: The ScrapedEmailProcessor processes businesses with websites, extracting emails using the Jsoup library, and updates the ScrapedBusiness entities.
- Email Sending: The SendEmailProcessor retrieves businesses with email addresses and sends promotional emails using predefined templates and content generated by ChatGPT. The status of each business is updated accordingly.
- Error Handling: Throughout the process, any errors encountered are logged, and the status of the affected business is updated to reflect the issue.
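To make the email extraction step more concrete, here is a minimal sketch of pulling addresses out of a page's HTML. It deliberately swaps in a plain regular expression from the Java standard library instead of Jsoup so that it runs without dependencies. The class name `EmailExtractionSketch`, the method `extractEmails`, and the pattern are assumptions for illustration, not the logic of ScrapedEmailProcessor.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: finds email-like strings in raw HTML using a regex. */
final class EmailExtractionSketch {

    // A deliberately simple pattern; it does not cover every valid address form.
    private static final Pattern EMAIL =
            Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}");

    /** Returns the distinct, lower-cased addresses in order of first appearance. */
    static Set<String> extractEmails(String html) {
        Set<String> emails = new LinkedHashSet<>();
        if (html == null) {
            return emails;
        }
        Matcher matcher = EMAIL.matcher(html);
        while (matcher.find()) {
            emails.add(matcher.group().toLowerCase(Locale.ROOT));
        }
        return emails;
    }

    public static void main(String[] args) {
        String html = "<a href=\"mailto:Info@Example.com\">Info@Example.com</a>"
                + "<p>Sales: sales@example.org</p>";
        System.out.println(extractEmails(html));
    }
}
```

A pattern like this can also match strings that are not addresses, such as image file names of the form `logo@2x.png`, so a real implementation would likely filter the results or restrict the search to `mailto:` links.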
- Java 17
- Maven
- Docker (for running PostgreSQL integration tests)
- Google Cloud Platform (GCP) API Key: Required for accessing the Google Places API to scrape business details.
- OpenAI ChatGPT API Key: Required for generating dynamic email content.
- Clone the repository.
- Configure the application properties in src/main/resources/application.properties to include your Google API key, OpenAI ChatGPT API key, database connection details, and other necessary configurations. Here is an example of the configuration:

      # Google Places API Key
      google.api.key=YOUR_GCP_API_KEY

      # OpenAI ChatGPT API Key
      openai.api.key=YOUR_CHATGPT_API_KEY

      # Database Configuration
      spring.datasource.url=jdbc:postgresql://localhost:5432/yourdatabase
      spring.datasource.username=yourusername
      spring.datasource.password=yourpassword

- Execute the SQL scripts provided in the Database Schema section to set up the required database tables.
- Build the project using Maven:

      mvn clean install

- Run the application (CORE):

      mvn spring-boot:run -pl :core

- Run the application (SCHEDULED):

      mvn spring-boot:run -pl :scheduled
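Before running either module, it can help to confirm that the keys from the configuration step are actually filled in. The sketch below is a standalone check using only `java.util.Properties`; it is not part of the project. The class name `ConfigCheckSketch` and the rule that values starting with `YOUR_` count as unset are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Hypothetical pre-flight check for the property keys shown in the setup steps. */
final class ConfigCheckSketch {

    static final List<String> REQUIRED_KEYS = List.of(
            "google.api.key",
            "openai.api.key",
            "spring.datasource.url",
            "spring.datasource.username",
            "spring.datasource.password");

    /** Returns required keys that are absent, blank, or still hold a YOUR_ placeholder. */
    static List<String> missingKeys(Properties props) {
        List<String> missing = new ArrayList<>();
        for (String key : REQUIRED_KEYS) {
            String value = props.getProperty(key);
            if (value == null || value.isBlank() || value.startsWith("YOUR_")) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) throws IOException {
        // Inline sample content; a real check would read application.properties instead.
        String sample = String.join("\n",
                "google.api.key=YOUR_GCP_API_KEY",
                "openai.api.key=example-key",
                "spring.datasource.url=jdbc:postgresql://localhost:5432/yourdatabase",
                "spring.datasource.username=yourusername");
        Properties props = new Properties();
        props.load(new StringReader(sample));
        System.out.println("Missing or placeholder keys: " + missingKeys(props));
    }
}
```

In a Spring Boot application the same intent is usually expressed through the framework's own configuration binding and validation, so this sketch only illustrates the idea of failing early on incomplete settings.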