SEMANTIC BACKEND ENGINEER

INFUSE
Full-time
Remote
Posted a month ago

Job Description

INFUSE is seeking an applied ML engineer to own the semantic ingestion pipeline that transforms raw PDFs into tagged, summarized, and searchable assets for INKHUB's B2B content catalog. The role involves building and maintaining the ETL pipeline, applying filtering logic, generating embeddings, and implementing a semantic search API.

Responsibilities

  • Own the ETL pipeline from raw PDFs to structured resources
  • Finalize the summarization and classification flow
  • Apply filtering logic to enforce resource quality
  • Map assets to the topic taxonomy
  • Generate dense embeddings
  • Load and query embeddings
  • Implement freshness logic
  • Build a QA/eval harness
  • Expose a semantic search API
  • Collaborate on UX integration

Requirements

  • Python, PyTorch, sentence-transformers, OpenAI APIs
  • FastAPI, Milvus or pgvector, PyPDF/Tika, Airflow or Lambda
  • Docker, GPU scheduling, Athena/Redshift SQL
  • Experience building ML pipelines
  • Experience with semantic search and embeddings
  • Experience with unstructured data

Benefits

  • No benefits