
Oryx 2: a realization of the lambda architecture

Oryx 2 is a realization of the lambda architecture built on Apache Spark and Apache Kafka, specialized for real-time, large-scale machine learning. It is a framework for building applications, but it also includes packaged, end-to-end applications for collaborative filtering, classification, regression and clustering.

The lambda architecture is a useful framework for thinking about the design of big data applications. Nathan Marz designed this generic architecture to address common requirements for big data, based on his experience working on distributed data processing systems at Twitter.

Some of the key requirements in building this architecture include:
  • Fault tolerance against hardware failures and human errors
  • Support for a variety of use cases that include low-latency querying as well as updates
  • Linear scale-out capabilities, meaning that throwing more machines at the problem should help with getting the job done
  • Extensibility, so that the system is manageable and can accommodate newer features easily

Image – Lambda architecture
  1. All data entering the system is dispatched to both the batch layer and the speed layer for processing.
  2. The batch layer has two functions: (i) managing the master dataset (an immutable, append-only set of raw data), and (ii) pre-computing the batch views.
  3. The serving layer indexes the batch views so that they can be queried in a low-latency, ad hoc way.
  4. The speed layer compensates for the high latency of updates to the serving layer and deals with recent data only.
  5. Any incoming query can be answered by merging results from batch views and real-time views.
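
The five steps above can be sketched with a toy example. This is only an illustration of the general lambda pattern, not code from Oryx 2: the class name `LambdaCounter`, its methods, and the page-view counting use case are all assumptions made for the sketch, and everything is kept in memory.

```python
from collections import Counter


class LambdaCounter:
    """Toy lambda architecture that counts events per key (hypothetical)."""

    def __init__(self):
        self.master_dataset = []        # append-only raw events (batch layer)
        self.batch_view = Counter()     # pre-computed from the master dataset
        self.realtime_view = Counter()  # events seen since the last batch run

    def ingest(self, key):
        # Step 1: every event is dispatched to both the batch and speed layers.
        self.master_dataset.append(key)
        self.realtime_view[key] += 1

    def run_batch(self):
        # Step 2: recompute the batch view from all historical data.
        self.batch_view = Counter(self.master_dataset)
        # Step 4: the speed layer only has to cover data the batch view lacks,
        # so its view is reset once the batch view has caught up.
        self.realtime_view = Counter()

    def query(self, key):
        # Steps 3 and 5: answer by merging the batch and real-time views.
        return self.batch_view[key] + self.realtime_view[key]


system = LambdaCounter()
for page in ["home", "about", "home"]:
    system.ingest(page)

before = system.query("home")  # answered from the real-time view alone
system.run_batch()             # batch view now covers the first three events
system.ingest("home")          # a new event arrives after the batch run
after = system.query("home")   # batch view (2) merged with real-time view (1)
```

In a real deployment the batch recomputation would run on a cluster and take much longer than the speed-layer update, which is exactly the latency gap the real-time view is meant to cover.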

Developers can also consume Oryx 2 as a framework for building custom applications. If you’re looking to deploy a ready-made, end-to-end application for collaborative filtering, clustering or classification, follow the architecture overview below and learn about the REST API endpoints in the API Endpoint Reference.

Image – Oryx 2

Oryx 2 consists of three tiers:

  1. Lambda Tier – Provides a base implementation that is not specific to machine learning. Internally, it contains the side-by-side cooperating layers of the lambda architecture:
  • A Batch Layer, which computes a new “result” (think model, but it could be anything) as a function of all historical data and the previous result. This may be a long-running operation that takes hours and runs a few times a day, for example.
  • A Speed Layer, which produces and publishes incremental model updates from a stream of new data. These updates are intended to happen on the order of seconds.
  • A Serving Layer, which receives models and updates and implements a synchronous API exposing query operations on the result.
  • A data transport layer, which moves data between layers and receives input from external sources.
  2. ML Tier Implementation – The ML tier is simply an implementation and specialization of the generic interfaces mentioned above; it implements common ML needs and then exposes a different, ML-specific interface for applications to fill in.
  3. End-to-end Application Implementation Tier – In addition to being a framework, Oryx 2 contains complete implementations of the batch, speed and serving layers for three machine learning use cases. These are ready to deploy out of the box, or to be used as the basis for a custom application:
  • Collaborative filtering / recommendation based on Alternating Least Squares
  • Clustering based on k-means
  • Classification and regression based on random decision forests
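
To make the division of labour between the Batch, Speed and Serving layers of the Lambda Tier more concrete, here is a minimal sketch. It does not use the Oryx 2 API: the function names, the `ServingLayer` class, the in-memory `update_topic` list standing in for the data transport layer, and the `"MODEL"` / `"UP"` message labels are all hypothetical choices for illustration. The "model" is just a running average.

```python
update_topic = []  # hypothetical in-memory stand-in for the transport layer


def batch_layer(history, previous_model):
    """Compute a new result from all historical data and the previous result."""
    generation = previous_model["generation"] + 1 if previous_model else 1
    model = {"generation": generation, "count": len(history), "total": sum(history)}
    update_topic.append(("MODEL", model))  # publish the full model
    return model


def speed_layer(new_values):
    """Publish one small incremental update per new data point."""
    for value in new_values:
        update_topic.append(("UP", value))


class ServingLayer:
    """Receives models and updates, and answers queries synchronously."""

    def __init__(self):
        self.model = None
        self.offset = 0  # position of the next unread message

    def consume(self):
        for kind, payload in update_topic[self.offset:]:
            if kind == "MODEL":
                self.model = dict(payload)  # replace with the new batch result
            elif kind == "UP" and self.model is not None:
                self.model["count"] += 1    # apply an incremental update
                self.model["total"] += payload
        self.offset = len(update_topic)

    def average(self):
        return self.model["total"] / self.model["count"]


history = [2.0, 4.0]
model = batch_layer(history, None)   # slow path: full recomputation
speed_layer([9.0])                   # fast path: one new data point
serving = ServingLayer()
serving.consume()
first = serving.average()            # batch model plus one incremental update

history.append(9.0)
model = batch_layer(history, model)  # the next batch run absorbs the new point
serving.consume()
second = serving.average()
```

The serving layer gives the same answer before and after the second batch run, because the incremental update had already approximated what the batch layer later recomputed from scratch.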

Download link: https://github.com/OryxProject/oryx/releases

Published by
Karthik
