ALL POSTS

Deploying Cloudera Impala on EC2 with Example Live Demo


A little while ago I blogged about (and open sourced) an Impala-powered soccer visualization demo, designed to demonstrate just how responsive Impala queries can be. Since not everyone has the time or resources to run the project themselves, we’ve decided to host it ourselves on an EC2 instance.

Meet Pythian’s Oracle Apps Experts at Collaborate 13!

There’s a rare event happening next week: All three of Pythian’s Oracle Apps tech leads will be in the same place. Vasu Balla, Maris Elsins, and recently minted Oracle ACE Director Yury Velikanov will all be at Collaborate 13 this year. Since we’re a globally distributed, round-the-clock team, this is not something that can happen very often.

How I Put My Collaborate Agenda Together

It is less than a week before the Collaborate 2013 conference. Most of us are busy putting together our individual agendas for what is shaping up to be a very exciting week. There are so many interesting sessions to choose from! Conference organizers kindly introduced a Show Planner to make planning a bit easier, but in this blog post, I will share with you how I created my agenda for Collaborate.

Using Ansible to Secure Cloudera Manager Installation on a Hadoop Cluster

Building a secure Hadoop cluster requires protecting the many services that make up the Hadoop infrastructure. If you are using the CDH distribution, Cloudera Manager (CM) is one of the components that needs to be secured. The CM documentation has a good step-by-step guide that is easy to follow for one server, but what about when you have hundreds of them? There are different approaches to managing server configuration at scale, but I’d like to focus on Ansible, a neat framework for parallel command execution and complex rollouts.

Log Buffer #313, A Carnival of the Vanities for DBAs

Should you patch now or wait a little? What quirks lurk in that stunning new feature? What are the limitations of that fancy index type, and are there any working examples of a particular add-on? The answers to questions like these are best found in the blogs. This Log Buffer Edition gives you a window onto those blogs.

Performance Settings of Concurrent Managers

This is the second article in a series about internals and performance of concurrent managers. In this post, we’ll take a look at three important settings that affect the performance of the concurrent managers: number of processes, “sleep seconds”, and “cache size”. This article might be a bit on the theoretical side, but it should provide a good understanding of how these settings actually affect the behavior and performance of concurrent managers.
