Week 10, 2019: AI & Performance

Humans Who Are Not Concentrating Are Not General Intelligences

This is a fun article with a somewhat scary realization. As AI & machine learning advance, computers can generate more and more seemingly realistic content. In this post, the author found that although texts generated by OpenAI’s latest language model look real to people who skim them, they are quite illogical to anyone who pays close attention. This reminds me of all the buzz around fake news & the US election a while back. Also, in case you are not aware, average human IQ has reportedly been trending downward for a while now.

Meanwhile, it’s arguably difficult to judge a person’s IQ through simple daily interactions. Not every job requires one to be smart, nor is being clever alone an indicator of success. I guess there is much more to human existence than mind & matter.

MySQL Challenge: 100k Connections

An article from an engineer at Percona (a company founded by former MySQL AB staff; MySQL AB built the original MySQL server and was later sold to Sun Microsystems). Optimisation at this level is not something we see on a daily basis at most companies, but it is nevertheless a joy to read about and perhaps someday try out ourselves. While it is not obvious from the post, a lot of optimisation happened across the software stack to pull off this seemingly impossible feat, especially since the test machine handling this monster workload only has 64GB of RAM and 2.2GHz CPU cores.
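
For a flavour of the tuning involved (illustrative values of my own, not the article’s actual configuration), pushing MySQL towards that many connections on Percona Server typically means touching knobs like these:

```ini
# my.cnf -- illustrative values only, not the settings from the article
[mysqld]
max_connections   = 100000           # allow the target number of client connections
back_log          = 3500             # bigger accept queue for connection storms
thread_handling   = pool-of-threads  # Percona Server thread pool instead of a thread per connection
thread_pool_size  = 44               # roughly one pool group per CPU core
open_files_limit  = 1000000          # every connection consumes file descriptors
table_open_cache  = 200000
```

The thread pool is the interesting part: with one OS thread per connection, 100k connections would drown the scheduler long before the hardware runs out of steam.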

Notes on the Amazon Aurora Paper

Have you ever wondered how AWS Aurora, the magical MySQL-compatible engine on AWS, works? I did. This post summarizes the key ideas behind what makes Aurora the massively scalable DB engine it is. In Aurora, the log IS the database, so writes are essentially O(1) log appends and point-in-time recovery comes almost for free. One day, we may end up building a highly scalable system ourselves, and I think this storage design could apply to its high-throughput components. By the way, the original paper is quite easy to digest (compared to the typical PhD papers that I loved to hate during my time as a research student) and is a recommended read if you have time.
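
To make the “log is the database” idea concrete, here is a toy sketch of my own (not Aurora’s actual implementation): every write is an append-only log record, and any past state can be rebuilt by replaying the log up to a chosen log sequence number (LSN), which is essentially what point-in-time recovery does.

```python
# Toy "the log is the database" store -- my own illustration, not Aurora's design.
from dataclasses import dataclass

@dataclass
class LogRecord:
    lsn: int    # log sequence number
    key: str
    value: str

class LogStructuredStore:
    def __init__(self):
        self.log = []        # the append-only log is the only source of truth
        self.next_lsn = 1

    def put(self, key, value):
        # A write is just an append of one log record.
        self.log.append(LogRecord(self.next_lsn, key, value))
        self.next_lsn += 1

    def snapshot(self, up_to_lsn=None):
        # Current state -- or any past state -- is a replay of the log.
        state = {}
        for rec in self.log:
            if up_to_lsn is not None and rec.lsn > up_to_lsn:
                break
            state[rec.key] = rec.value
        return state

store = LogStructuredStore()
store.put("balance", "100")
store.put("balance", "250")
print(store.snapshot())              # {'balance': '250'} -- latest state
print(store.snapshot(up_to_lsn=1))   # {'balance': '100'} -- point-in-time view
```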

Training ImageNet in 1.5 Minutes

A team of AI researchers developed a system that trains a deep neural network (DNN) on ImageNet, a very popular image-recognition dataset, with a speedup of 410x (yes, you read that right, over 400 times faster). This level of performance is truly ground-breaking considering previous attempts only achieved a 2.5x speedup, even though the field has matured considerably in recent years in terms of accuracy. I feel a bit nostalgic about this news because image processing with DNNs was my area of focus during my AI course.

On a side note, I run an AI-related project on the side and can see this development becoming part of a SaaS solution in the near future. The network is still gonna be an issue, but hopefully the speed gain will remain significant enough to matter in spite of the bottleneck.

And one more thing…

Seven Myths in Machine Learning Research

The title is self-explanatory. There is also a TL;DR at the beginning, so I won’t go into further detail here. Fun note: when I first started with machine learning, I was also tricked by Myth #1.

XKCD-style plots in Matplotlib

Who can resist an XKCD comic? I can’t. This guy built a feature for Matplotlib (arguably the most commonly used visualisation library among data scientists) to make plots that look like they came straight out of an XKCD comic. Loved it!
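
If you want to try it yourself, the feature is exposed in Matplotlib as plt.xkcd(); here is a minimal example (the hand-drawn look works best if the xkcd/Humor Sans font is installed, otherwise Matplotlib falls back to a default font):

```python
import matplotlib.pyplot as plt
import numpy as np

# Everything drawn inside plt.xkcd() gets the wobbly, hand-drawn XKCD styling.
with plt.xkcd():
    x = np.linspace(0, 10, 100)
    plt.plot(x, np.sin(x) + 1.5, label="enthusiasm")
    plt.plot(x, np.exp(-x / 4), label="free time")
    plt.xlabel("weeks into a side project")
    plt.legend()
    plt.title("Every side project ever")
    plt.show()
```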

Polar

Polar is a tool to manage your reading. Yes, you read that right, a reading tool. It is particularly useful for keeping track of the latest scientific research in your field of interest. Although not quite what it was created for, you can also use Polar as a repository of long-form articles and other web content. I wish something like this had existed when I was doing my research degree; it might have saved a few (or a lot of) sleepless nights going through notes.

That’s it for this week. See you next week!

Coming up next: Testing Testing Testing