I was contacted by the folks at MONyog and asked if I would review MONyog. Since using MONyog is something I have been wanting to do for a while, I jumped at the chance. Of course, “jumped” is relative; Rohit asked me at the MySQL User Conference back in April, and here it is two months later, in June. My apologies to folks for being slow.
This is an overall review of MONyog, with specific attention to the newest features released in the recent beta (Version 2.5 Beta 2). Feature requests are marked with (feature request). This review is quite long, so feel free to bookmark it and read it at your leisure. If you have comments please add them, even if it takes a while for you to read this entire article.
The Webyog website gives some information about what MONyog can do, but it is a bit vague about what MONyog is. There is a link to a PDF whitepaper, “What is MONyog?”, which does answer many of these questions.
The screenshots available from the website are accurate, so I will not reproduce them here. I will note that I have not shared this feedback with the webyog team yet, so I may be upset that a feature is lacking, and the feature may be implemented but I missed it. I will post a follow-up in that case, even though they will likely comment here too.
My reference points — I have used other monitoring and graphing tools such as Nagios, Cacti, and Intermapper as well as MySQL’s Enterprise Monitor.
As an overall review — MONyog is the best out-of-the-box GUI monitoring tool for MySQL that I have seen. It “just works.” As promised, getting up and running quickly is easy, and having a centralized location for monitoring is very useful. The graphs are beautiful and the statistics that are graphed are useful time-savers.
The biggest difference between MySQL’s Enterprise Monitor and MONyog is that MONyog is agentless. At Pythian, we have many clients with differing security requirements. Requiring a daemon process to be running is not something we currently do, and it might be a hard sell for some clients. Even if every client was amenable to it, making sure the daemon is up and running on each MySQL server is tedious. Note that agentless operation works for every feature, including log analysis as well as operating system statistics.
Installing the centralized MONyog server was a matter of installing an RPM. When I upgraded from version 2.05 to 2.5, I had to uninstall the old version and install the newer version; rpm -U did not work to update the package. It was simple to do, and I did not have dependency problems with the initial installation nor the upgrade (I had to install compat-libstdc++ as a dependency but that installed with no problem). For the record, the error I got trying to upgrade was:
$ sudo rpm -Uvh MONyog-trial-2.5-1.i386.rpm
Password:
Preparing... ########################################### [100%]
file /etc/init.d/MONyogd from install of MONyog-trial-2.5-1 conflicts with file from package MONyog-multi-2.05-1
file /usr/local/MONyog/bin/MONyog from install of MONyog-trial-2.5-1 conflicts with file from package MONyog-multi-2.05-1
file /usr/local/MONyog/bin/MONyog-bin from install of MONyog-trial-2.5-1 conflicts with file from package MONyog-multi-2.05-1
file /usr/local/MONyog/bin/libjs.so from install of MONyog-trial-2.5-1 conflicts with file from package MONyog-multi-2.05-1
file /usr/local/MONyog/bin/libnspr4.so from install of MONyog-trial-2.5-1 conflicts with file from package MONyog-multi-2.05-1
file /usr/local/MONyog/bin/libssh.so.2 from install of MONyog-trial-2.5-1 conflicts with file from package MONyog-multi-2.05-1
file /usr/local/MONyog/res/MONyog.res from install of MONyog-trial-2.5-1 conflicts with file from package MONyog-multi-2.05-1
I am sure this can be fixed easily, or perhaps it is fixed for regular binaries but not trial versions? Still, an upgrade should not have the error that files from the previous version conflict (feature request).
Adding New Servers
Another difference between MONyog and the other monitoring systems I have used is that adding a new server is very easy (MONyog’s terminology for “adding a server to the monitoring” is “registering a server”). Even if getting to the server requires an SSH tunnel, the option is right there on the registration screen, and you have the choice to use a public/private key pair or a password. If that wasn’t enough flexibility, you can specify any port for the SSH tunnel, since some organizations run sshd on a non-standard port.
After adding several servers, clicking on checkboxes to select and de-select servers to look at became tedious. Being able to see the same graphs for a group of machines would make correlation easy, a wonderful addition (feature request). The graphs are beautiful and well-done. However, a great feature would be a link to “select all” or “select none” and even “invert selection” so if I wanted to see graphs for “all but 2 servers” I could do so easily (feature request).
In a similar vein, groups of servers would be nice (feature request). Currently I’d love to group by client; in my previous job I used Nagios and Cacti and grouped them via their function (db slaves, dbs that processed emails, etc). And of course a link to select all or none in a group, and retain the ability to view multiple servers even if they are not in the same group. I would also like to have a server be in multiple groups — with Nagios I had slaves that processed e-mails in both the “slaves” and “email” groups (feature request).
The “list of servers” screen could be two columns, and perhaps three (feature request). There is a lot of wasted whitespace because the server list spans only one column. This was very noticeable when I set MONyog up with over 20 servers. Also in the “better GUI” category is that “Register a new server” and the menu of what to “Go” to (Dashboard, Log Analyzer, Processlist, Monitor Advisor(s), and Delete Selected Registration(s) ) should be at the top of the server list — currently it is at the bottom and gets lost with a full list.
Many machines use the same ssh tunnel and the same mysql login credentials; only the mysql hostname is different. It would be nice to copy a configuration, so that adding similar hosts is easy (feature request).
The “List of Servers” is sorted by the order they were added (their id). It would be useful to sort by name, by group and by whether or not the server has monitoring enabled (feature request).
Under “Data Collection Options” there is an option called “Data retention timeframe”. The choices are from 1-30 and seconds/minutes/hours/days. That makes it seem like the maximum data retention is only 30 days. I am not sure if it actually deletes after 30 days or if it just deletes the original data but keeps aggregates. The description is “Data collected before this timeframe is purged automatically,” so it certainly seems as though it deletes the data after this timeframe. If it does not delete all information, that should be made more clear; and if it does, I definitely want to be able to keep more than 30 days’ worth of information (feature request)!
The one problem with registering a new server is that there does not seem to be a way to specify a socket. A port is always required, even if you specify the host as “localhost”. This forced me to GRANT privileges to email@example.com because the already-specified user@localhost was not being used; this was for a non-standard port on localhost.
Also, you should be able to export the list of servers for when you change/migrate MONyog servers (feature request). I actually did migrate the MONyog trial server I was using, and attempted to copy and paste the data directories but it did not work (I probably needed the metadata sqlite files too, not just the contents of the “data” directory, but some easy way to migrate servers is definitely required).
While it is fabulous that MONyog offers OS-level monitoring and graphing, currently only “Linux” is supported. I say “Linux” because I could not find a list of supported Linux distributions, so I am not sure if they actually mean every single Linux distribution (for example, is Damn Small Linux supported?). I only reviewed it on RedHat systems (Fedora, RHEL, CentOS), and confirmed that Solaris was not supported. I am sure that in the future more operating systems will be supported (including BSD-based systems) (feature request).
The processlist shows the information you would see with a SHOW PROCESSLIST. You can filter the processlist, and not just by the content of the query. According to the in-program help,
MONyog populates a SQLite table in memory with the contents of MySQL Processlist. You can issue a SELECT query on this table to filter your Processlist display. This gives you unprecedented flexibility to filter data. The table name is processlist and the column names are Id, User, Host, Db, Command, Time, State, Info and Action.
“Unprecedented flexibility” indeed! I am very excited about this. You can easily see full queries and kill threads you have privileges to kill. There are icons to copy a query to the clipboard, and to run an EXPLAIN of the query. It would be nice, however, to get a count of the processlist, as the MySQL command line client gives you at the end of a SHOW PROCESSLIST (feature request). I would like this count to be flexible enough to include or exclude the filter so I could see how many processes match the filter as well as how many total processes there were even though I might be filtering out some (feature request).
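To make the filtering idea concrete, here is a minimal sketch of the same technique using Python's built-in sqlite3 module. It is not MONyog's code: the sample rows and the filter condition are made up for illustration, and only the table name and column names come from the help text quoted above. It also shows the "matching versus total" count requested above.

```python
import sqlite3

# Hypothetical sample rows; only the column names come from MONyog's help text.
ROWS = [
    (1, "app", "web1:50112", "shop", "Query", 12, "Sending data", "SELECT * FROM orders", ""),
    (2, "app", "web2:50340", "shop", "Sleep", 4, "", None, ""),
    (3, "repl", "db2:41002", None, "Binlog Dump", 86400, "Has sent all binlog to slave", None, ""),
    (4, "report", "bi1:39881", "shop", "Query", 7, "Sorting result", "SELECT COUNT(*) FROM visits", ""),
]


def build_processlist(rows):
    """Load processlist rows into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE processlist (Id INTEGER, User TEXT, Host TEXT, Db TEXT, "
        "Command TEXT, Time INTEGER, State TEXT, Info TEXT, Action TEXT)"
    )
    conn.executemany("INSERT INTO processlist VALUES (?,?,?,?,?,?,?,?,?)", rows)
    return conn


conn = build_processlist(ROWS)

# An example filter: running queries older than 5 seconds, longest first.
FILTER = "Command = 'Query' AND Time > 5"
matching = conn.execute(
    f"SELECT Id, User, Time, Info FROM processlist WHERE {FILTER} ORDER BY Time DESC"
).fetchall()
total = conn.execute("SELECT COUNT(*) FROM processlist").fetchone()[0]

for row in matching:
    print(row)
print(f"{len(matching)} of {total} threads match the filter")
```

Because the filter is ordinary SQL, anything a WHERE clause can express (LIKE on Info, ranges on Time, combinations of User and Db) works the same way.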
You have the choice to auto-refresh the processlist, pause the auto-refresh, or to take a look at more queries. If you desire, you can have two web browser windows/tabs open: one to pause the auto-refresh for analysis and another to have the automatic refresh going. The interval for automatic refresh is customizable (per server). I did not change this from the default of one second, but if I wanted to default to 5 seconds, I would probably want an “edit multiple servers” feature (feature request); currently you can only edit the settings for one server at a time.
Perhaps better than being able to filter on Time and Command (“Sleep”, “Query”, “Binlog Dump”, etc.), though, is the use of color. MONyog shows you threads that have a time over 10 seconds by coloring the entire row light red. Threads between 5 and 10 seconds are yellow, and threads between 3 and 5 seconds are colored very light yellow.
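The color rules amount to a small threshold table. The sketch below is my own restatement of them, not MONyog's implementation; in particular, which color a thread gets at exactly 5 or 10 seconds is an assumption, since I only observed the ranges described above.

```python
def row_color(time_seconds):
    """Return the highlight color for a processlist row, or None for no highlight.

    Boundary handling (exactly 3, 5 or 10 seconds) is an assumption.
    """
    if time_seconds > 10:
        return "light red"
    if time_seconds >= 5:
        return "yellow"
    if time_seconds >= 3:
        return "very light yellow"
    return None


for t in (1, 4, 7, 12):
    print(t, row_color(t))
```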
Log Analyzer (new)
The log analysis tools are a great way to centralize DBA tasks such as looking at slow query logs. With MONyog you have one place where you can look at slow query logs from more than one machine without having to login to multiple servers. You can choose to aggregate queries, which changes string and numeric values to “XXX”, similar to how mysqldumpslow aggregates queries.
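For readers who have not seen this kind of aggregation, here is a rough sketch of the idea in Python. It is an approximation I wrote for illustration, not MONyog's or mysqldumpslow's actual algorithm: the regular expressions and the sample queries are my own, and a real implementation would handle more SQL syntax.

```python
import re
from collections import Counter

# Quoted strings (single or double quotes, with backslash escapes).
_STRING = re.compile(r"'(?:[^'\\]|\\.)*'|\"(?:[^\"\\]|\\.)*\"")
# Standalone integers and decimals (digits inside identifiers such as t1 are left alone).
_NUMBER = re.compile(r"\b\d+(?:\.\d+)?\b")


def aggregate(query):
    """Replace literal values with XXX so similar queries collapse together."""
    query = _STRING.sub("XXX", query)
    query = _NUMBER.sub("XXX", query)
    return " ".join(query.split())  # normalize whitespace


# Hypothetical slow queries for illustration.
queries = [
    "SELECT * FROM users WHERE id = 42 AND name = 'bob'",
    "SELECT * FROM users WHERE id = 7  AND name = 'alice'",
    "SELECT COUNT(*) FROM t1 WHERE created > '2008-06-01'",
]

counts = Counter(aggregate(q) for q in queries)
for pattern, n in counts.most_common():
    print(n, pattern)
```

The first two queries differ only in their literal values, so they are counted as one pattern.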
It is very convenient to be able to see slow queries from a certain time period. Other options for slow query log viewing are by volume (i.e., “the most recent 300MB of data”) and by number of entries (i.e., “the last 100 rows of the log”). There is not an option to view the slow query log as raw text, although you can simulate that by looking at time periods without aggregating queries.
The “Fetch Current Log Details From MySQL” feature did not work, probably because MySQL’s SHOW VARIABLES LIKE ‘log_error’ does not show the right value for some versions. In all fairness, there is a warning on the screen that some features are not available until MySQL 5.1.6. However, this means I had to update all 20 servers manually to put in logfile location details. You can specify whether the logfiles are local to the MONyog server or if they should be retrieved via SFTP. When using SFTP, MONyog automatically uses the details in the SSH server section (used to monitor OS details). These details can be different from or the same (with one click) as the SSH tunnel.
I feel as though MONyog could be “smart” about it, and deduce that if you are using an SSH tunnel, the log files are not local to the MONyog server (feature request).
The slow query log only filters based on time, not based on the content of the query. However, with the sniffer you can filter on how long a query takes as well as the content of the query (i.e., filter only for SELECT queries). I have not used the general log analyzer but it looks similar to the sniffer and the slow query log analyzer.
It would be nice to have an “export” feature to save the filtered logs and log analyses that MONyog shows on-screen (feature request).
Some of the graph colors are not very different — blue, blue-green, grey are the first three colors. Yellow is the fourth and is a NICE contrast. I can imagine that for color-blind folks the first three colors are impossible to distinguish. This should probably be fixed with higher-contrast colors (feature request).
While it is nice not having agents, there is a consequence. If your monitoring server cannot reach the db, you will not gather data, and that data will be lost. It is often nice to know what was going on inside the server when it could not be reached from external hosts. But that is the tradeoff, and MONyog is the only agentless monitoring system I have found. (Note: I do not consider turning on SNMP as true “agentless” monitoring because you still have to change your system, and turning on SNMP can be a very dangerous security risk for folks who do not fully understand the protocol.)
MONyog does a great job of graphing status variables and formulas involving variables (such as index cache usage). I should, however, be able to FLUSH STATUS remotely from MONyog; if this feature exists, I could not find it (feature request).
One of my pipe dreams for monitoring systems is the ability to annotate graphs and save the annotations — a load spike could be labeled as “June 2008 marketing promotion” or “annual Labor Day power outage” or similar (feature request).
The connection information, including passwords, is stored in SQLite files. Each server has a “connection.data” within MONyog’s “data/” directory. Both “more” and “strings” show the connection information, as does the command line sqlite3 interface to SQLite. These files are world-readable by default, so anyone with access to the machine can read them. In general there should probably be more fine-grained control; right now I installed MONyog using sudo, and the default is to run as root.
I am not sure what technology(ies) MONyog embeds internally to run its web service (default port is 5555), but Apache has had its share of vulnerabilities. If MONyog ever did get compromised, it would be best if it was running as a non-root user. This should be automated by the install, or at least be an option (feature request). And certainly the “connection.data” files should *not* be world-readable (feature request).
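Until the installer handles this, the permissions can be checked and tightened by hand. The following is a small sketch for POSIX systems using only the Python standard library. The function names are my own, and whether MONyog keeps working after the change depends on which user it runs as, so treat it as a starting point rather than a tested procedure.

```python
import os
import stat
from pathlib import Path


def world_readable(path):
    """True if the 'other' read bit is set on the file."""
    return bool(os.stat(path).st_mode & stat.S_IROTH)


def find_exposed(data_dir, filename="connection.data"):
    """List every matching file under data_dir that any local user can read."""
    return sorted(p for p in Path(data_dir).rglob(filename) if world_readable(p))


def restrict(path):
    """Limit the file to owner read/write only (mode 0600)."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
```

Calling find_exposed() on MONyog's data directory (wherever your install put it) lists the files at risk, and restrict() can then be applied to each one.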
On my first install I had tried to use yum to install the locally downloaded file, and got the error
Package MONyog-multi-2.05-1.i386.rpm is not signed
When I used rpm to install, there was no such issue. This is probably a yum issue, but I figured I would note it here.
I was able to define 20 servers. I did find that occasionally MONyog had stopped, but there was no indication it had crashed (i.e., core files) and nothing in the logs to indicate whether it was shut down normally or crashed. Nobody else was using the test machine so chances are it had crashed. I was not able to assess how many machines might make MONyog crash.
Note that I was using a test server for this. This test server has 630 MB of memory and 1 GB of swap, so I would not accuse MONyog of being unstable. It is very possible that MONyog would not exhibit this behavior on a server that had production-level resources.
I did not use the notification feature with MONyog. I was mostly interested in the graphing functionality and log analysis. As well, we already have paging for our systems, and I did not want spurious pages. However, given the quality and flexibility of the other parts of MONyog as well as the settings I would have used to set up notifications, I am sure the notification feature will meet the needs most folks have for alert notifications.
Even though I did not use the notification feature, I did look at the options, so I will go into what those are. You can specify whether to e-mail each data collection or alerts only. You can specify a “minimum notification interval” which I believe is useful both for alerts (e.g., page once, and then do not page again until 15 minutes later) and for data collection (e.g., send one data collection per day). Notifications are via e-mail only. I did not see if I could specify more than one e-mail address, so if that is not possible, consider this a feature request (feature request). There are plenty of other features and flexibility that the MONyog system could use, including paging thresholds, paging escalations, and dependencies (feature request).
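Since I did not test notifications, the following is only my reading of what a “minimum notification interval” does, written as a small Python sketch. The class and its names are hypothetical and are not taken from MONyog.

```python
class NotificationThrottle:
    """Suppress repeat notifications for the same key within a minimum interval."""

    def __init__(self, min_interval_seconds):
        self.min_interval = min_interval_seconds
        self._last_sent = {}  # key (e.g. server name) -> time of last notification

    def should_send(self, key, now):
        """Return True and record the send if enough time has passed for this key."""
        last = self._last_sent.get(key)
        if last is not None and now - last < self.min_interval:
            return False
        self._last_sent[key] = now
        return True


# Page once, then stay quiet for 15 minutes (900 seconds) per server.
throttle = NotificationThrottle(900)
print(throttle.should_send("db1", now=0))    # first alert goes out
print(throttle.should_send("db1", now=600))  # suppressed, only 10 minutes later
print(throttle.should_send("db1", now=900))  # allowed again after 15 minutes
```

Keying the throttle per server means one noisy host does not silence alerts from the others.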
MONyog allows for custom OS and MySQL system and status variables to be checked. From the Tools->Customize page:
The OS checks (memory, load, etc.) are also exposed.
As far as I know, however, MONyog is closed source and does not allow for custom checks to be written. A lot of the power of Nagios, Cacti, and even MySQL’s Enterprise Monitoring System is that custom checks can be written by anyone. I do not mind that MONyog is closed source; their product is wonderful and I would willingly pay for it. I cannot imagine that a simple API for custom checks is not on their roadmap; but if it is not, it should be (feature request).
The one major area where MONyog needs a lot of work is logging and troubleshooting. First, if a server is not collecting data, even if notifications are turned on, the list of servers should have some kind of “green light/OK vs. red light/BAD” icon/coloring/whatever to show whether or not a server is running.
The dashboard gives a sort of “green light/OK vs. red light/BAD” list, but the dashboard shows wide graphs, and puts all the selected hosts side-by-side on the page. Having to scroll to the right is very annoying, and I find I have to scroll to the right if I selected more than four servers. That is certainly not “at-a-glance”, which I would expect something called “Dashboard” to be. Obviously, if I had notifications turned on I would know which servers were having issues. But even when notifications are turned on, if I get a pager storm I might not be able to easily keep in my head which servers are up and which are down.
Ironically, the “Monitors/Advisors” is a better dashboard than “Dashboard” is, though it still has the issue of horizontal scrolling. There is no good way to get rid of the horizontal scrolling issue because each host has many items displayed. This means it is only practical to compare a few servers at a time. This is not a problem, as this is what is usually done. Some graphing systems allow one graph to show multiple hosts (feature request) (The systems that allow this also allow groups, so requesting this feature may depend on the group feature previously mentioned).
Perhaps a name change from “Dashboard” would help? The “Dashboard” shows MySQL connections, MySQL cache hit rate, MySQL statements, OS System availability, and CPU Usage. Maybe “Quick Stats”?
Troubleshooting is very difficult because the logging is very, very poor. This is the one place where I am actually disappointed — every other feature has actually wowed me because the folks at Webyog have done their job extremely well.
From the logs:
[2.5 Beta 2] [2008-06-23 17:00:21] lib/webyog/src/ymysql.cpp(373) ErrCode:5001 ErrMsg:Tunnel to MySQL server was not created successfully.
[2.5 Beta 2] [2008-06-23 17:00:24] lib/webyog/src/ysshsession.cpp(228) ErrCode:1 ErrMsg:Authentication by password failed. Access denied. authentications that can continue : publickey,gssapi-with-mic,password
[2.5 Beta 2] [2008-06-23 17:00:24] lib/webyog/src/ytunnel.cpp(207) ErrCode:-1 ErrMsg:SSH2 Connect and Authorize failed: Authentication by password failed. Access denied. authentications that can continue : publickey,gssapi-with-mic,password
[2.5 Beta 2] [2008-06-23 17:00:24] lib/webyog/src/ymysql.cpp(295) ErrCode:5001 ErrMsg:Tunnel to MySQL server was not created successfully.
While I am glad to see the error in the error log, I have no idea which of the 20 servers I configured is having this error. There is a “test connection” button in the server properties, but a password had changed on me and I had no idea which machine broke. Hence my comment above that there should be a real “dashboard” to see very quickly which servers are not currently gathering data — and the ability to sort by that would be nice too (feature request). I asked above to be able to sort by whether data collection was enabled or disabled; this is separate from having a server that has data collection enabled but not working.
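Because the log carries no server name, about the best one can do today is summarize entries by source file and error code and then hunt for the server by hand. Here is a sketch of that in Python; the regular expression is my own guess at the format based on the four lines above, not a documented MONyog log grammar.

```python
import re
from collections import Counter

# Pattern inferred from the log lines shown above; the version/timestamp prefix
# and the (line number) are treated as optional.
ENTRY = re.compile(
    r"^(?:\[(?P<version>[^\]]+)\] \[(?P<timestamp>[^\]]+)\] )?"
    r"(?P<source>\S+?)(?:\((?P<line>\d+)\))? "
    r"ErrCode:(?P<code>-?\d+) ErrMsg:(?P<message>.*)$"
)

SAMPLE = """\
[2.5 Beta 2] [2008-06-23 17:00:21] lib/webyog/src/ymysql.cpp(373) ErrCode:5001 ErrMsg:Tunnel to MySQL server was not created successfully.
[2.5 Beta 2] [2008-06-23 17:00:24] lib/webyog/src/ysshsession.cpp(228) ErrCode:1 ErrMsg:Authentication by password failed. Access denied. authentications that can continue : publickey,gssapi-with-mic,password
[2.5 Beta 2] [2008-06-23 17:00:24] lib/webyog/src/ytunnel.cpp(207) ErrCode:-1 ErrMsg:SSH2 Connect and Authorize failed: Authentication by password failed. Access denied. authentications that can continue : publickey,gssapi-with-mic,password
[2.5 Beta 2] [2008-06-23 17:00:24] lib/webyog/src/ymysql.cpp(295) ErrCode:5001 ErrMsg:Tunnel to MySQL server was not created successfully.
"""


def parse(text):
    """Return one dict per recognizable log line; unrecognized lines are skipped."""
    entries = []
    for line in text.splitlines():
        match = ENTRY.match(line.strip())
        if match:
            entries.append(match.groupdict())
    return entries


entries = parse(SAMPLE)
by_code = Counter((e["source"], e["code"]) for e in entries)
for (source, code), n in by_code.most_common():
    print(f"{n}x {source} ErrCode:{code}")
```

Note what is missing from every parsed field: nothing identifies the registered server, which is exactly the gap described above.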
Some more errors I received that need more information to help me determine “which server is this?”:
lib/webyog/src/ymysql.cpp ErrCode:1045 ErrMsg:Access denied for user 'user'@'host' (using password: YES)
user and host are the actual user and host; I have changed them here for privacy reasons. Sometimes I can guess which server this problem is on by the user@host value, but other times when the user is “pythian@localhost” or “firstname.lastname@example.org” I have no idea which machine it is.
lib/webyog/src/ysftp.cpp ErrCode:-1 ErrMsg:Connection Failed: socket : Too many open files
Is this too many open files on the MONyog server (maybe why it crashed for me?), or too many open files on the MySQL server?
lib/webyog/src/ysshsession.cpp(97) ErrCode:2 ErrMsg:Error while connecting to ssh server: Failed to resolve hostname (Name or service not known)
This is copied verbatim from the log; it actually says “hostname” in there. I have no way of knowing which hostname failed to resolve. This was likely a transient error too, so knowing which server failed to resolve would be nice (feature request). I could not find any parameters to make logging more (or less) verbose; once more information is being put into the log this would be good to have (feature request).
For the following log entries, again, knowing which server would be useful, as would knowing which check caused the error. Given the first error, the check that caused it was probably a replication check using SHOW SLAVE STATUS; the second was probably caused by a log analysis check, as that is what sftp is used for. Logs should be more useful than this.
MONyog/connectionmgr.cpp ErrCode:1227 ErrMsg:Access denied; you need the SUPER,REPLICATION CLIENT privilege for this operation
lib/webyog/src/ysftp.cpp ErrCode:-1 ErrMsg:Could not open file '/var/run/mysqld/mysqld.pid': sftp server : No such file
lib/webyog/src/ytunnel.cpp ErrCode:-1 ErrMsg:SSH2 Connect and Authorize failed: Error while connecting to ssh server: Connecting : Connection refused
The following log entries were not helpful, giving me no way to fix the problem. In the following, I do not even know what the error is, precisely:
[Note that each line is its own entry in the MONyog log, and is not necessarily related to the previous or next line in this block.]
lib/webyog/src/ylibssh2.cpp ErrCode:-1 ErrMsg:Error while libssh2_session_startup.
lib/webyog/src/ylibssh2.cpp ErrCode:-1 ErrMsg:Error while writing in the channel.
lib/webyog/src/ysftp.cpp ErrCode:-1 ErrMsg:Error setting SSH options
lib/webyog/src/ysftp.cpp ErrCode:-1 ErrMsg:SSH channel not initialized.
MONyog/populatemysql.cpp(78) ErrCode:-1 ErrMsg:ConnectToMySQL
lib/webyog/src/ysqlite.cpp ErrCode:5 ErrMsg:database is locked
lib/webyog/src/ymysql.cpp ErrCode:1018 ErrMsg:Can't read dir of '.' (errno: 24)
MONyog/snapshotmgr.cpp ErrCode:-1 ErrMsg:BEGIN IMMEDIATE TRANSACTION
MONyog/snapshotmgr.cpp ErrCode:-1 ErrMsg:END TRANSACTION
MONyog/snapshotmgr.cpp ErrCode:-1 ErrMsg:ROLLBACK TRANSACTION
lib/webyog/src/ysqlite.cpp ErrCode:1 ErrMsg:SQL logic error or missing database
Though I have mentioned a lot of poor logging, there were many lines in the log that
The worst part, logging and troubleshooting, was saved for last. Because I really like MONyog and do not want to leave people with a bad taste in their mouths, I will close with what I opened with:
As an overall review — MONyog is the best out-of-the-box GUI monitoring tool for MySQL that I have seen. It “just works.” As promised, getting up and running quickly is easy, and having a centralized location for monitoring is very useful. The graphs are beautiful and the statistics that are graphed are useful time-savers.
[Very special thanks go to Rohit Nadhani and Peter Laursen of Webyog for letting me test their software so I could write this review. After reviewing the software, I definitely recommend MONyog to anyone in a DBA role as worth every penny.]