Lead Generative AI Engineer (Diffusion Models, 3D, VLM)
Edensign - Boston, MA
Job Description
Company Description
Edensign is building the future of AI-powered visual and spatial engines. Backed by the Harvard Innovation Labs, we're creating next-generation intelligent systems that merge generative AI, 3D understanding, and spatial intelligence to transform how real-world spaces are visualized, staged, and experienced.

Contact Email:

Description
Full-time | Preference for Boston-based candidates

We're looking for a senior technical leader to drive the development of our core AI engine. The ideal candidate has deep experience training large generative models, including diffusion models, 3D reconstruction networks, and multimodal/VLM architectures. In this role, you will spearhead model training pipelines, R&D experiments, data strategy, and foundational architecture decisions.

This is an opportunity to help build the next generation of spatial AI, from multi-view consistency to 2D-to-3D-to-2D transformation and advanced scene understanding.

Key Responsibilities
- Design, train, and optimize cutting-edge generative models, including diffusion, 3D reconstruction, and multimodal/VLM architectures
- Build and manage scalable training pipelines, data curation workflows, and experiment tracking
- Lead research experiments, benchmarking, and exploration of new modeling techniques
- Architect the evolution of our spatial AI stack, from prototyping new ideas to deploying production-ready models
- Collaborate with engineering and product teams to integrate AI capabilities seamlessly into real-world workflows
- Make strategic decisions around infrastructure, GPU utilization, model efficiency, and training optimization
- Contribute to Edensign's long-term technical roadmap and innovation direction

Qualifications
- Strong expertise in training generative models (diffusion, GANs, 3D generative models, or scene-reconstruction networks)
- Deep background in Computer Vision, Computer Graphics, 3D geometry, NeRF-like architectures, or multi-view learning
- Familiarity with node-based generative tools (e.g., ComfyUI) is a plus
- Experience with VLMs, multimodal models, grounding, or spatial reasoning is highly valuable
- Proficiency in Python and modern ML frameworks
- Hands-on experience with distributed training, GPU optimization, and large-scale experiment management
- Ability to work independently and lead technical direction in a fast-paced startup environment
- Strong analytical, problem-solving, and system design skills
- Excellent communication and collaboration skills
- Master's or PhD in Computer Science, AI/ML, Computer Vision, or a related field
- Experience in real estate, architecture, spatial design, or spatial computing is a bonus
- Proficiency in Mandarin is preferred
Created: 2026-01-24