Onehouse is a cloud-native, fully managed lakehouse service built on Apache Hudi. It combines the ease of use of a data warehouse with the scalability of a data lake, allowing users to ingest data at scale with minute-level freshness, centrally store it, and make it available to any downstream query engine and use case. Onehouse was created by the creators of Apache Hudi and is used by companies like Uber, Amazon, and ByteDance.
• scalable
• cost-efficient
• lakeview (free tool)
• apache xtable
• apache hudi
• supports all query engines
• ingest data from all sources in minutes
• fully managed cloud data lakehouse
Backend Engineer (India)
Onehouse is a fully managed cloud data lakehouse built on Apache Hudi that ingests data in minutes and supports all query engines.
Benefits:
Equity Compensation
Health & Well-being
Financial Future
Location
Generous Time Off
Experience Requirements:
6+ years of experience as a backend engineer, developing and operating microservices in a distributed environment.
Experience with Kubernetes, gRPC and Java.
Experience deploying applications on one or more global cloud platforms (AWS/GCP/Azure).
Operational excellence in monitoring/deploying/testing microservice architectures.
Great problem-solving skills and a keen eye for detail.
Responsibilities:
Collaborate with the team to implement new APIs, database/caching abstractions and services, to support user-facing products, external partner integrations or infrastructure tooling/monitoring systems.
Exhibit full ownership of product features, including design and implementation, from concept to completion.
Be passionate about designing for future scale and high availability, while possessing a deep understanding of common failure patterns and their remediations.
Uphold a high engineering bar around the code, monitoring, operations, automated testing, and release management of the platform.
Data Platform Engineer (India)
Benefits:
Equity Compensation
Health & Well-being
Financial Future
Location
Generous Time Off
Experience Requirements:
3+ years of experience in building and operating data pipelines in Apache Spark.
2+ years of experience with workflow orchestration tools such as Apache Airflow or Dagster.
Proficient in Java, Maven, Gradle and other build and packaging tools.
Adept at writing efficient SQL queries and troubleshooting query plans.
Experience managing large-scale data on cloud storage.
Responsibilities:
Be the thought leader on all things data engineering within the company: schemas, frameworks, and data models.
Implement new sources and connectors to seamlessly ingest data streams.
Build scalable job management on Kubernetes to ingest, store, manage and optimize petabytes of data on cloud storage.
Optimize Spark applications to run flexibly in batch or streaming mode based on user needs, balancing latency against throughput.
Tune clusters for resource efficiency and reliability to keep costs low while still meeting SLAs.
Data Infrastructure Engineer (US)
Benefits:
Competitive Compensation
Equity Compensation
Health & Well-being
Financial Future
Location
Experience Requirements:
Strong object-oriented design and coding skills (Java and/or C/C++, preferably on a UNIX or Linux platform).
Experience with inner workings of distributed (multi-tiered) systems, algorithms, and relational databases.
You embrace ambiguous/undefined problems with an ability to think abstractly and articulate technical challenges and solutions.
An ability to prioritize across feature development and tech debt with urgency and speed.
An ability to solve complex programming/optimization problems.
Responsibilities:
Design new concurrency control and transactional capabilities that maximize throughput for competing writers.
Design and implement new indexing schemes, specifically optimized for incremental data processing and analytical query performance.
Design systems that help scale and streamline metadata and data access from different query/compute engines.
Solve hard optimization problems to improve the efficiency (increase performance and lower cost) of distributed data processing algorithms over a Kubernetes cluster.
Leverage data from existing systems to find inefficiencies, and quickly build and validate prototypes.