Data Crawling and Scraping Engineer
As a Data Crawling and Scraping Engineer, you'll play a critical role in acquiring and structuring high-value external data that powers our core products. Your work will fuel our knowledge graph of millions of entities and directly support our mission to deliver transparency and insight into complex global networks. Your knowledge of core web technologies, such as networking, DNS, CAPTCHAs, APIs, WAFs, and proxies, will open up the vast scale of open-source intelligence data to our internal stakeholders and our customers.
You'll work closely with engineering, research, and product teams to identify new data sources, develop reliable pipelines to gather, ingest, and structure that data, and continuously improve our ability to scale and adapt. You'll have ownership over how information flows into our platform — from design and architecture to reliability and performance — and help shape the systems that underpin our next generation of features and products.
What You'll Do
- Design and implement systems to crawl, scrape, extract, and normalize external data from a variety of web-based sources.
- Collaborate with researchers and analysts to identify new sources of valuable company, economic, and supply chain data and define integration strategies.
- Build robust, scalable pipelines that ingest structured and semi-structured data into our database.
- Ensure high levels of accuracy, coverage, and freshness across incoming data streams.
- Contribute to the evolution of our data platform and internal tooling.
- Improve system reliability, observability, and performance over time.
You Should Apply If You...
- Have 3+ years of experience as a backend or full-stack software engineer.
- Have intimate knowledge of how to crawl the internet at scale.
- Have strong programming skills, especially in Python.
- Have experience working with structured and unstructured data from diverse external systems.
- Are comfortable debugging complex issues involving networking, content rendering, or inconsistent source data.
- Are proficient with SQL and relational databases.
- Are a clear communicator who collaborates effectively with both technical and non-technical teammates.
- Are passionate about turning raw data into meaningful insight, and eager to work on technically nuanced challenges.
Bonus Points If You Have...
- Familiarity with headless browser automation or techniques for collecting data from dynamic content sources.
- Expertise in the architecture, technologies, and tools that run the modern internet, such as DNS, networking, CDNs, WAFs, CAPTCHAs, proxies, and reverse proxies.
- Experience with event-driven architecture.
- Eagerness to incorporate new technologies and validate their usefulness using structured experiments and thorough testing.
- Experience building health monitoring and observability tools for consumption by automated tools, engineers, and non-technical stakeholders.
What You'll Love About WireScreen
At WireScreen, you'll do high-impact work that helps shape global commerce and policy. We're a mission-driven team with a growth mindset—curious, collaborative, and unafraid to take on bold challenges. You'll be empowered to act, heard when you speak, and supported as you grow. With strong market momentum and ambitious goals, this is an exciting time to join us and help build something that truly matters.
Benefits & Perks
At WireScreen, we care deeply about our team and are committed to supporting your well-being—both in and out of the workplace. Here's how we take care of our employees:
- Competitive compensation including salary, equity, and rapid growth potential
- 100% company-paid Medical, Dental, and Vision coverage for employees
- FSA, HSA, and 401(k) options to help you plan for healthcare expenses and retirement
- Generous paid time off plus company-wide holidays to help you rest and recharge
- Commuter benefits for NYC and D.C.-based employees
- Hybrid office schedule for NYC-based and D.C.-based employees