Posts Categorized: Big Data
A little while ago I blogged about (and open sourced) an Impala-powered soccer visualization demo, designed to demonstrate just how responsive Impala queries can be. Since not everyone has the time or resources to run the project themselves, we’ve decided to host it ourselves on an EC2 instance.
As with many previous IOUG/OAUG/Quest shows, Pythian will be in Denver next week! It was a sunny day in the fall of 1991 when I gave my first paper at International Oracle User Week (IOUW), a precursor to COLLABORATE and a few earlier incarnations called IOUG-Live and IOUG-Alive! It has been a whirlwind of…
Building a secure Hadoop cluster requires protecting a number of services that make up the Hadoop infrastructure. If you are using the CDH distribution, then Cloudera Manager (CM) is one of the components that needs to be secured. There is a good step-by-step guide in the CM documentation, and it’s easy to follow for one server, but what about when you have hundreds of them? There are different approaches to the problem of managing server configuration at scale, but I’d like to focus on Ansible, which is a neat framework for parallel command execution and complex rollouts.
Modern commercial supercomputing in the age of Datafication is what we today call Big Data. I think a better term for it would be Data Supercomputing but the industry has already spoken so Big Data it is. The architecture shifted from environments that required massively-parallel compute-intensive number crunching to massively-parallel data-volume-intensive processing.
The HDFS authentication model changed in recent releases, but the documentation is stale, which can lead people into thinking HDFS is using very primitive authentication.
Following my “Building Integrated DWH with Oracle and Hadoop” webinar for IOUG Big Data SIG, I got a bunch of excellent follow-up questions. The most frequently asked questions are: What is the minimum I need to do to get started with Hadoop? and How do I load data into Hadoop? Since so many people are interested in the same questions, it makes more sense to answer on the blog.
Shortly before we all went on break for the holiday, Oracle announced the new BDA X3-2. Now I have time to properly sit down with a glass of fine scotch and dig into the details of what is included in the release. Turns out that there are quite a few changes packed in. We are getting new hardware, new Hadoop, new Connectors and new NoSQL. Tons of awesome features are included. Let’s get into it.
This Log Buffer Edition, Log Buffer #289, covers everything that is happening at Oracle Open World and more.
Tuesday morning at OOW is always occupied by this forum, an opportunity for authors and other persons to receive a heads-up on what’s coming down the pipe from Oracle. The following notes are musings from yours truly as I attended the forum today.
The day was action-packed with sessions, and I presented my “Oracle RMAN: Don’t Forget the Basics” to an enthusiastic crowd in Moscone West. The room was close to full with some hanging out at the back. It has been a while since I have presented in a room so close to being full. There were a handful of questions and comments during the session, and a group of attendees approached me to follow up afterwards.