Tag: Hadoop

Recursion in Hive – Part 1

In Part 1 of this series, Valentin Nikotin will teach you about migrating from RDBMS to Hive while maintaining the simplicity and flexibility of a SQL approach.

Read More >

What is Big Data and Do You Really Need it?

Are you interested in transforming your business potential, but stuck asking the same old question, “What is Big Data?” Pythian’s CTO and Oracle ACE, Alex Gorbachev, explains everything you need to know.

Read More >

Step-by-Step Upgrades to Cloudera Manager and CDH

Lately, several of our security-conscious clients have expressed a desire to install and/or upgrade their Hadoop distribution on cluster nodes that do not have access to the internet. In such cases, the installation needs to be performed using local…

Read More >

Log Buffer #443: A Carnival of the Vanities for DBAs

This Log Buffer Edition finds and publishes blog posts from Oracle, SQL Server and MySQL.

Read More >

Calculating Business Days in HiveQL

One of the common tasks in data processing is to calculate the number of days between two given dates. You can easily achieve this by using the Hive DATEDIFF function. You can also get the weekday number by using this more obscure…

Read More >

Watch: Hadoop vs. Riak

Every data platform has its value, and deciding which one will work best for your big data objectives can be tricky. Alex Gorbachev, Oracle ACE Director, Cloudera Champion of Big Data, and Chief Technology Officer at Pythian, has recorded a series…

Read More >

Watch: Hadoop vs. HBase

Every data platform has its value, and deciding which one will work best for your big data objectives can be tricky. Alex Gorbachev, Oracle ACE Director, Cloudera Champion of Big Data, and Chief Technology Officer at Pythian, has recorded a series…

Read More >

Avro MapReduce Jobs in Oozie

Normally, when using Avro files as input or output to a MapReduce job, you write a Java main() method to set up the Job using AvroJob. That documentation page does a good job of explaining where to use AvroMappers, AvroReducers,…

Read More >

Is X a Big Data Product?

Virtually everyone in the data space claims today that they are a Big Data vendor and that their products are Big Data products. Of course, if you are not in Big Data, then you are legacy. So how do you know whether a product is a Big Data product?

Read More >

Small Files on MapR-FS

One of the well-known best practices for HDFS is to store data in a few large files, rather than a large number of small ones. There are a few problems related to using many small files, but the ultimate HDFS killer…

Read More >