Oryx 2 project: a realization of the lambda architecture

Oryx 2 is a realization of the lambda architecture built on Apache Spark and Apache Kafka, specialized for real-time, large-scale machine learning. It is a framework for building applications, but it also includes packaged, end-to-end applications for collaborative filtering, classification, regression and clustering.

The lambda architecture is a useful framework for thinking about the design of big data applications. Nathan Marz designed this generic architecture to address common big data requirements, based on his experience working on distributed data processing systems at Twitter.

Some of the key requirements in building this architecture include:
  • Fault tolerance against hardware failures and human errors
  • Support for a variety of use cases that include low-latency querying as well as updates
  • Linear scale-out capabilities, meaning that throwing more machines at the problem should help with getting the job done
  • Extensibility, so that the system is manageable and can accommodate newer features easily

Image – Lambda architecture
  1. All data entering the system is dispatched to both the batch layer and the speed layer for processing.
  2. The batch layer has two functions: (i) managing the master dataset (an immutable, append-only set of raw data), and (ii) pre-computing the batch views.
  3. The serving layer indexes the batch views so that they can be queried in a low-latency, ad-hoc way.
  4. The speed layer compensates for the high latency of updates to the serving layer and deals with recent data only.
  5. Any incoming query can be answered by merging results from batch views and real-time views.
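
The five steps above can be sketched in a few lines of Python. This is only a toy illustration that counts events per key in memory; the class name `LambdaPipeline` and its methods are hypothetical and are not part of Oryx 2 or any other library.

```python
from collections import Counter


class LambdaPipeline:
    """Toy lambda architecture that counts events per key (illustrative only)."""

    def __init__(self):
        self.master_dataset = []         # immutable, append-only raw events
        self.batch_view = Counter()      # pre-computed by the batch layer
        self.realtime_view = Counter()   # maintained by the speed layer

    def ingest(self, key):
        # Step 1: every event is dispatched to both the batch and speed layers.
        self.master_dataset.append(key)
        self.realtime_view[key] += 1

    def run_batch(self):
        # Step 2: recompute the batch view from the full master dataset.
        self.batch_view = Counter(self.master_dataset)
        # Step 4: the speed layer only needs to cover data the batch has not seen.
        self.realtime_view = Counter()

    def query(self, key):
        # Steps 3 and 5: answer by merging the batch view and the real-time view.
        return self.batch_view[key] + self.realtime_view[key]


pipeline = LambdaPipeline()
for page in ["home", "cart", "home"]:
    pipeline.ingest(page)
before_batch = pipeline.query("home")   # served from the real-time view only

pipeline.run_batch()                    # batch view catches up, real-time view resets
pipeline.ingest("home")
after_batch = pipeline.query("home")    # batch view plus one recent event
```

In a real deployment the batch recomputation would run on a distributed engine and the views would live in a serving store; the point here is only the split between a slow, complete view and a fast, recent one.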

Developers can also consume Oryx 2 as a framework for building custom applications. If you're looking to deploy a ready-made, end-to-end application for collaborative filtering, clustering or classification, start with the architecture overview below.

The REST API endpoints are described in the API Endpoint Reference.

Image – Oryx 2

Oryx 2 consists of three tiers:

  1. Lambda Tier – Provides a base implementation that is not specific to machine learning. Internally, it contains the cooperating layers of the lambda architecture:
  • A Batch Layer, which computes a new “result” (think model, but it could be anything) as a function of all historical data and the previous result. This may be a long-running operation that takes hours and runs a few times a day, for example.
  • A Speed Layer, which produces and publishes incremental model updates from a stream of new data. These updates are intended to happen on the order of seconds.
  • A Serving Layer, which receives models and updates and implements a synchronous API exposing query operations on the result.
  • A data transport layer, which moves data between layers and receives input from external sources.
  2. ML Tier Implementation – The ML tier is an implementation and specialization of the generic interfaces mentioned above. It implements common ML needs and then exposes a different, ML-specific interface for applications to fill in.
  3. End-to-end Application Implementation Tier – In addition to being a framework, Oryx 2 contains complete implementations of the batch, speed and serving layers for three machine learning use cases. These are ready to deploy out of the box, or can be used as the basis for a custom application:
  • Collaborative filtering / recommendation based on Alternating Least Squares
  • Clustering based on k-means
  • Classification and regression based on random decision forests
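
To make the interplay of the Lambda Tier layers more concrete, here is a minimal Python sketch in which the "result" is just a running mean. It is not Oryx 2 code: the in-memory `update_topic` queue stands in for the data transport layer, and the function names and the `"MODEL"` / `"UPDATE"` message labels are assumptions made for this illustration.

```python
import json
from collections import deque

# Stand-in for the data transport layer (a real system would use a message bus).
update_topic = deque()


def batch_layer(historical, previous_model):
    """Compute a new result from all historical data and the previous result."""
    if not historical:
        return previous_model  # nothing new to learn from; keep the old result
    model = {"count": len(historical), "sum": float(sum(historical))}
    update_topic.append(("MODEL", json.dumps(model)))  # publish the full model
    return model


def speed_layer(new_values):
    """Publish one small incremental update per newly arrived value."""
    for value in new_values:
        update = {"count": 1, "sum": float(value)}
        update_topic.append(("UPDATE", json.dumps(update)))


class ServingLayer:
    """Holds the current result in memory and answers queries synchronously."""

    def __init__(self):
        self.state = {"count": 0, "sum": 0.0}

    def consume(self):
        # Drain the topic: a full model replaces the state, an update adjusts it.
        while update_topic:
            kind, payload = update_topic.popleft()
            message = json.loads(payload)
            if kind == "MODEL":
                self.state = message
            else:
                self.state["count"] += message["count"]
                self.state["sum"] += message["sum"]

    def mean(self):
        if self.state["count"] == 0:
            return None
        return self.state["sum"] / self.state["count"]


model = batch_layer([2.0, 4.0], previous_model=None)  # slow, complete pass
speed_layer([6.0])                                    # fast, incremental pass
serving = ServingLayer()
serving.consume()
current_mean = serving.mean()
```

The design choice worth noticing is that the serving layer never touches raw data: it only replays models and updates from the transport, which is what lets the batch and speed layers run on very different schedules.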

Download link: https://github.com/OryxProject/oryx/releases

Published by
Karthik
