Showing posts with label Databases. Show all posts

Monday, July 22, 2024

Cloudera - Streaming Data Platform

Cloudera has a fairly mature streaming offering on their Data Platform. Data from varied sources such as rich media, text, chat, message queues, etc. is brought into their unified DataFlow platform using NiFi or other ETL/ELT tools. After processing, it can be directed to one or more of the operational/application DB, Data Lake (Iceberg), Vector DB post embedding (for AI/ML), etc.

Streaming helps AI/ML apps by providing real-time context that the apps can leverage. Feedback mechanisms, grounding of outputs, avoiding hallucinations, model evolution, etc. all require real-time data to be available. So with a better, faster data & MLOps platform, Cloudera is looking to improve the quality of the ML apps running on it.

Cloudera has also made it easy to get started with ML with their cloud-based Accelerators (AMPs). AMPs support not just Cloudera-built modules, but also those from others like Pinecone, AWS, Hugging Face, etc. & the ML community. Apps for chat, text summarization, image analysis, time series, LLMs, etc. are available for use off the shelf. As always, Cloudera continues to offer all deployment options (on-premise, cloud & hybrid) as per the customer's needs.

 

Saturday, November 11, 2023

Starting off with Databases

A note shared with a friend on getting started with databases in a Windows environment. Putting it up for the larger audience.

1) Try MySQL (or PostgreSQL) online via the browser:

• https://onecompiler.com/mysql/3zt5uh4dc
• https://www.w3schools.com/mysql/mysql_exercises.asp
• https://www.w3schools.com/mysql/exercise.asp
• https://www.mycompiler.io/online-mysql-editor
• https://www.mycompiler.io/new/mysql
• https://extendsclass.com/postgresql-online.html
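For a first set of statements to try, here is a minimal sketch of the usual create / insert / query cycle. It is written in Python with the built-in sqlite3 module (SQLite, not MySQL or PostgreSQL) purely so that it runs anywhere without an install. The `books` table and its rows are made up for illustration. The SELECT statements are plain SQL and should carry over to the online editors above, though column types and auto-numbered ids are declared a little differently in each database.

```python
import sqlite3

# In-memory SQLite database: nothing to install, discarded when the script exits.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical practice table (not taken from any of the linked sites).
cur.execute("""
    CREATE TABLE books (
        id     INTEGER PRIMARY KEY,
        title  TEXT NOT NULL,
        author TEXT NOT NULL,
        year   INTEGER
    )
""")

# Insert a few sample rows; the ? placeholders keep values separate from the SQL text.
rows = [
    ("SQL Basics", "A. Writer", 2019),
    ("Data Modelling", "B. Author", 2021),
    ("Query Tuning", "A. Writer", 2023),
]
cur.executemany("INSERT INTO books (title, author, year) VALUES (?, ?, ?)", rows)
conn.commit()

# Filter and sort: books published from 2020 onwards, oldest first.
recent = cur.execute(
    "SELECT title, year FROM books WHERE year >= 2020 ORDER BY year"
).fetchall()

# Aggregate: number of books per author.
per_author = cur.execute(
    "SELECT author, COUNT(*) FROM books GROUP BY author ORDER BY author"
).fetchall()

print(recent)
print(per_author)
conn.close()
```

Once these feel comfortable, the same SELECT queries can be pasted into one of the MySQL editors after creating the table there.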


2) Install MySQL on Windows:

• https://dev.mysql.com/downloads/installer/


3) Run a Windows Virtual Machine (VM) with MS SQL Server installed in that VM:

3.1) Use either VMware Player or VirtualBox as the virtualization software

VMware Player: https://www.vmware.com/in/products/workstation-player.html
Oracle VirtualBox: https://www.virtualbox.org/wiki/Downloads


3.2) Download the corresponding VM:

• https://developer.microsoft.com/en-us/windows/downloads/virtual-machines/


3.3) Log in to the VM & install MS SQL Server in it

Hope it helps.

Friday, March 20, 2015

Teradata

Teradata is busy claiming a chunk of the Big Data pie. The combination of Teradata Parallel Transporter (TPT) and advanced SQL makes querying Big Data sources fast and efficient, using state-of-the-art caching and other optimizations.