Data Engineer

Eventible

4 - 6 years

Pune City

Posted: 29/05/2026

Job Description

About Eventible

Eventible is a B2B event intelligence platform. We track thousands of business events globally and build structured data around event agendas, sponsors, speakers, exhibitors, topics, and audience signals. That data powers our products, outreach, and internal intelligence systems.

As our data stack grows, we need someone who can help build and run reliable web data pipelines at scale.


Role Overview

We are hiring a Data Engineer (Web Data & Automation) to build and maintain LLM-assisted web extraction workflows that collect, structure, store, enrich, and clean data from public web sources.

This is a hands-on execution role. The work includes building web scrapers, handling APIs, using browser automation where needed, storing extracted data cleanly, and improving data quality through enrichment, validation, deduplication, and QA.

This is not a marketing operations role, campaign role, or reporting role. It is a build-and-run role for someone comfortable working with web data, automation, and applied AI.


Key Responsibilities
  • Build and maintain web data collection pipelines for public sources such as event websites, sponsor pages, speaker directories, exhibitor lists, agendas, and landing pages
  • Develop LLM-assisted extraction workflows to pull structured information from messy or inconsistent web pages
  • Use APIs, scraping tools, and browser automation to collect data reliably across static and dynamic websites
  • Store extracted data in a clean, usable format across databases, spreadsheets, or other internal systems
  • Enrich and standardise raw data, including company names, job titles, industries, locations, event metadata, and taxonomy mapping
  • Clean, validate, deduplicate, and QA collected data before it enters downstream workflows
  • Debug failed jobs, broken selectors, parsing issues, rate limits, and workflow errors
  • Work with internal stakeholders to improve data coverage, accuracy, and usability
  • Document workflows, extraction logic, schemas, prompts, and SOPs so systems are maintainable over time
  • Support the ongoing improvement of Eventible's internal data and automation stack
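
As a rough illustration of the standardise-and-deduplicate responsibilities above, here is a minimal Python sketch. The field names (`company`, `event`), the suffix list, and the sample records are hypothetical and are not taken from Eventible's actual schema or tooling.

```python
import re

# Hypothetical suffix list; a real pipeline would maintain a fuller one.
LEGAL_SUFFIXES = {"inc", "llc", "ltd", "corp", "gmbh", "co"}


def normalise_company(name: str) -> str:
    """Lowercase, replace punctuation with spaces, drop trailing legal suffixes."""
    tokens = re.sub(r"[^\w\s]", " ", name.lower()).split()
    while tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def deduplicate(records: list[dict]) -> list[dict]:
    """Keep the first record seen for each (company, event) key."""
    seen = set()
    unique = []
    for record in records:
        company = normalise_company(record.get("company", ""))
        if not company:
            continue  # validation step: skip records with no company name
        key = (company, record.get("event", "").strip().lower())
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Made-up sample rows, as they might arrive from a sponsor page scrape.
sponsors = [
    {"company": "Acme, Inc.", "event": "DataConf 2026"},
    {"company": "ACME Inc", "event": "dataconf 2026"},
    {"company": "", "event": "DataConf 2026"},
    {"company": "Globex Ltd.", "event": "DataConf 2026"},
]
print(deduplicate(sponsors))
```

Keying on a normalised name rather than the raw string is one simple way to catch spelling variants; fuzzy matching or entity resolution would be a natural next step for harder cases.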


Requirements
  • 2 to 4 years of experience in data engineering, web scraping, automation engineering, backend-heavy data workflows, or similar hands-on roles
  • Strong working knowledge of Python
  • Experience with web scraping, parsing, and structured data extraction from HTML, JSON, APIs, or public websites
  • Comfortable using tools and libraries such as Requests, BeautifulSoup, Playwright, Selenium, Scrapy, or equivalent
  • Comfortable working with LLM APIs for extraction, classification, cleanup, enrichment, or validation tasks
  • Able to read API documentation and make working GET/POST calls independently
  • Experience storing and handling data in databases or data stores such as Postgres, MySQL, MongoDB, Supabase, BigQuery, or similar
  • Strong data cleaning and QA instincts, including handling duplicates, inconsistent naming, missing fields, and edge cases
  • Good working knowledge of Google Sheets / Excel for validation, reconciliation, and quick analysis
  • Able to work independently, debug problems, and ship reliable solutions without heavy hand-holding


Good to Have
  • Experience scraping dynamic sites or sites with inconsistent page structures
  • Familiarity with tools such as Apify, Firecrawl, Browserbase, Postman, n8n, Zapier, or Make
  • Basic understanding of proxies, rate limits, retries, job scheduling, and anti-breakage patterns
  • Experience with cloud storage, cron jobs, containerised jobs, or lightweight backend deployment
  • Exposure to taxonomy mapping, entity resolution, or enrichment workflows
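
For candidates unsure what "retries" and "rate limits" look like in practice, the sketch below shows one common pattern: retrying a failing call with exponential backoff and a little jitter. The function name `fetch_with_retries` and its parameters are illustrative assumptions, not part of any Eventible system or specific library.

```python
import random
import time


def fetch_with_retries(fetch, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it succeeds or max_attempts is reached.

    Waits base_delay * 2**attempt seconds, plus up to 0.1 s of random
    jitter, between tries. The sleep function is injectable for testing.
    """
    for attempt in range(max_attempts):
        try:
            return fetch()
        except Exception:
            # A real job would catch narrower errors (timeouts, HTTP 429/5xx).
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            sleep(delay)


# Usage with a stand-in for a flaky request: fails twice, then succeeds.
calls = {"count": 0}


def flaky_request():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


print(fetch_with_retries(flaky_request, base_delay=0.01))
```

Doubling the wait on each attempt gives a rate-limited or briefly unavailable site time to recover, and the jitter helps avoid many jobs retrying at the same instant.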


What We're Looking For

The right candidate is practical, technical, and resourceful. They do not stop at "the page structure changed." They investigate, adapt, and fix. They are comfortable working with messy public-web data and understand that the job is not just to extract data, but to make it usable.

We are looking for someone who can build, test, debug, and improve pipelines with a high degree of ownership.


Location & Reporting
  • Location: Pune, in-office
  • Reporting to: Senior Manager, Data & Insights




