From the category archives:

web development

HipHop for PHP: Move Fast

by admin on February 2, 2010

One of the key values at Facebook is to move fast. For the past six years, we have been able to accomplish a lot thanks to the rapid pace of development that PHP offers. As a programming language, PHP is simple: simple to learn, simple to write, simple to read, and simple to debug. We are able to get new engineers ramped up at Facebook a lot faster with PHP than with other languages, which allows us to innovate faster.

Today I'm excited to share the project a small team of amazing people and I have been working on for the past two years: HipHop for PHP. With HipHop we've reduced the CPU usage on our Web servers by about fifty percent on average, depending on the page. Less CPU means fewer servers, which means less overhead. This project has had a tremendous impact on Facebook. We feel the Web at large can benefit from HipHop, so we are releasing it as open source this evening in the hope that it brings a new focus toward scaling large, complex websites with PHP. While HipHop has shown us incredible results, it's certainly not complete, and you should be comfortable with beta software before trying it out.

HipHop for PHP isn't technically a compiler itself. Rather, it is a source code transformer. HipHop programmatically transforms your PHP source code into highly optimized C++ and then uses g++ to compile it. HipHop executes the source code in a semantically equivalent manner and sacrifices some rarely used features, such as eval(), in exchange for improved performance. HipHop includes a code transformer, a reimplementation of PHP's runtime system, and a rewrite of many common PHP Extensions to take advantage of these performance optimizations.

Scaling PHP as a Scripting Language

PHP's roots are those of a scripting language, like Perl, Python, and Ruby, all of which have major benefits in terms of programmer productivity and the ability to iterate quickly on products. This contrasts with more traditional compiled languages like C++ and bytecode-compiled languages like Java. On the other hand, scripting languages are known to generally be less efficient when it comes to CPU and memory usage. Because of this, it's been challenging to scale Facebook to over 400 billion PHP-based page views every month.

One common way to address these inefficiencies is to rewrite the more complex parts of your PHP application directly in C++ as PHP Extensions. This largely transforms PHP into a glue language between your front-end HTML and application logic in C++. From a technical perspective this works well, but it drastically reduces the number of engineers who are able to work on your entire application. Learning C++ is only the first step to writing PHP Extensions; the second is understanding the Zend APIs. Given that our engineering team is relatively small (there are over one million users to every engineer), we can't afford to make parts of our codebase less accessible than others.

Scaling Facebook is particularly challenging because almost every page view comes from a logged-in user with a customized experience. When you view your home page we need to look up all of your friends, query their most relevant updates (from a custom service we've built called Multifeed), filter the results based on your privacy settings, then fill out the stories with comments, photos, likes, and all the rich data that people love about Facebook. All of this happens in just under a second. HipHop allows us to write the logic that does the final page assembly in PHP and iterate on it quickly while relying on custom back-end services in C++, Erlang, Java, or Python to service the News Feed, search, Chat, and other core parts of the site.

Since 2007 we've thought about a few different ways to solve these problems and have even tried implementing a few of them. The common suggestion is to just rewrite Facebook in another language, but given the complexity and speed of development of the site this would take some time to accomplish. We've rewritten aspects of the Zend Engine — PHP's internals — and contributed those patches back into the PHP project, but ultimately haven't seen the sort of performance increases that are needed. HipHop's benefits are nearly transparent to our development speed.

Hacking Up HipHop

One night at a Hackathon a few years ago (see Prime Time Hack), I started my first piece of code transforming PHP into C++. The languages are fairly similar syntactically and C++ drastically outperforms PHP when it comes to both CPU and memory usage. Even PHP itself is written in C. We knew that it was impossible to successfully rewrite an entire codebase of this size by hand, but wondered what would happen if we built a system to do it programmatically.

Finding new ways to improve PHP performance isn't a new concept. At run time the Zend Engine turns your PHP source into opcodes, which are then run through the Zend Virtual Machine. Open source projects such as APC and eAccelerator cache this output and are used by the majority of PHP-powered websites. There's also Zend Server, a commercial product which makes PHP faster via opcode optimization and caching. Instead, we were thinking about transforming PHP source directly into C++, which can then be turned into native machine code. Even compiling PHP isn't a new idea: open source projects like Roadsend and phc compile PHP to C, Quercus compiles PHP to Java, and Phalanger compiles PHP to .NET.

Needless to say, it took longer than that single Hackathon. Eight months later, I had enough code to demonstrate it is indeed possible to run faster with compiled code. We quickly added Iain Proctor and Minghui Yang to the team to speed up the pace of the project. We spent the next ten months finishing up all the coding and the following six months testing on production servers. We are proud to say that at this point, we are serving over 90% of our Web traffic using HipHop, all only six months after deployment.

How HipHop Works

The main challenge of the project was bridging the gap between PHP and C++. PHP is a scripting language with dynamic, weak typing. C++ is a compiled language with static typing. While PHP allows you to write magical dynamic features, most PHP is relatively straightforward. It's more likely that you see if (...) {...} else {...} than it is to see function foo($x) { include $x; }. This is where we gain in performance. Whenever possible, our generated code uses static binding for functions and variables. We also use type inference to pick the most specific type possible for our variables and thus save memory.

The transformation process includes three main steps:

  1. Static analysis, where we collect information on what is declared where and on dependencies,
  2. Type inference, where we choose the most specific type between C++ scalars, String, Array, classes, Object, and Variant, and
  3. Code generation, which for the most part is a direct correspondence from PHP statements and expressions to C++ statements and expressions.
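As a rough illustration of the type inference step, the sketch below assigns one type per variable by looking at every value assigned to it, and falls back to Variant when the assignments disagree. The type names follow the list above, but the rules, the input format, and the function names (`type_of`, `join`, `infer_types`) are assumptions made for illustration. This is not HipHop's actual algorithm, and it is written in Python rather than C++ for brevity.

```python
# Toy type inference: pick the most specific type that covers every
# assignment to a variable, or "Variant" when no single type fits.

def type_of(value):
    """Map a literal value to a toy type name."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "bool"
    if isinstance(value, int):
        return "int64"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "String"
    if isinstance(value, (list, dict)):
        return "Array"
    return "Variant"


def join(a, b):
    """Return the most specific type covering both a and b."""
    return a if a == b else "Variant"


def infer_types(assignments):
    """assignments: iterable of (variable_name, literal_value) pairs."""
    inferred = {}
    for name, value in assignments:
        t = type_of(value)
        inferred[name] = join(inferred[name], t) if name in inferred else t
    return inferred


program = [
    ("$count", 1),
    ("$count", 2),       # still an integer
    ("$name", "HipHop"),
    ("$mixed", 1),
    ("$mixed", "one"),   # integer and string: needs the catch-all type
]
print(infer_types(program))
```

Here `$count` and `$name` keep a specific type, while `$mixed` ends up as Variant. The more variables that stay out of the catch-all case, the more the generated code can use plain C++ types.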

We have also developed HPHPi, which is an experimental interpreter designed for development. When using HPHPi you don't need to compile your PHP source code before running it. It's helped us catch bugs in HipHop itself and provides engineers a way to use HipHop without changing how they write PHP.

Overall HipHop allows us to keep the best aspects of PHP while taking advantage of the performance benefits of C++. In total, we have written over 300,000 lines of code and more than 5,000 unit tests. All of this will be released this evening on GitHub under the open source PHP license.

Learn More this Evening

This evening we're hosting a small group of developers to dive deeper into HipHop for PHP and will be streaming this tech talk live. Check back here around 7:30pm Pacific time if you'd like to watch.

As I'm sure there will be plenty of questions, starting this evening take a look at the HipHop wiki or join the HipHop developer mailing list. You'll also find us at FOSDEM, SCALE, PHP UK, ConFoo, TEK X, and OSCON over the next few months talking about HipHop for PHP. We're very excited to evolve HipHop into a thriving open source project along with all of you.

Haiping Zhao, a senior engineer, has found Facebook to be a programmer's paradise.

Facebook Connect Made Easier for Site Owners

by admin on September 30, 2009

Facebook has announced the launch of the Facebook Connect Wizard and Playground. These tools are designed to make it easier for website owners to integrate Facebook Connect into their sites.

Facebook says the benefits of Facebook Connect include increased traffic and engagement. "Establishing a presence on the social Web requires fundamental building blocks. Facebook provides these essential tools, including identity for a great registration system, and immediate access to 300 million active global users," says Facebook platform engineer Alex Himel. "Facebook Connect gives entrepreneurs of all sizes -- and with varying developer resources -- the ability to build traffic efficiently through reaching a relevant audience, while offering an engaging user experience."

Of course, the more sites that utilize Facebook Connect, the more traffic and engagement Facebook itself gets as well. The concept is not unlike Google's Place Pages (among other products) in that regard. A reader recently commented, "Google wants people to spend more time on Google. Yahoo wants people to spend more time on Yahoo. Facebook wants people to spend more time on Facebook. Several of these large online 'media' [companies] are doing everything in their power to keep the eyeballs on their website..."

Like Google's Place Pages, however, Facebook Connect does have the potential to benefit businesses in some ways. "From making the registration process easier for users, to bringing friends together, to gaining distribution from sharing back to Facebook, there are many benefits that come along with Facebook Connect, and we're focused on helping you optimize your website and service to provide a more social experience for users," says Himel.

The Facebook Connect Wizard consists of three steps:

1. Enter Basic Info about your site
2. Upload file
3. Social Markup

The Playground provides code samples for adding profile pictures, user names, and friends to your site. Facebook says they will continue to add more code samples as they see more usage.

Earlier today, Facebook announced translations for Facebook Connect, which allows sites and apps to be translated into over 65 languages.

Google Developers Produce New Programming Language

by admin on September 17, 2009

A new programming language that runs on the Java Virtual Machine is available thanks to a couple of Google's developers. It is called Noop (pronounce it like an abbreviated version of "no operation"), and its developers claim that it combines the finest aspects of other languages and attempts to guide users towards accepted best practices.

Other parts of the new Noop homepage (which is hosted by Google Code) explain that Noop "in source form looks similar to Java.  The goal is to build dependency injection and testability into the language from the beginning, rather than rely on third-party libraries as all other languages do."

Then, "Immutability and minimal variable scope are encouraged by making final/const behavior the default and providing easy access to a functional style.  Testability is encouraged by providing Dependency Injection at the language level and a compact constructor injection syntax."
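Noop's own syntax isn't shown in the quoted material, so the following is only a sketch of the two ideas in Python: values that are immutable by default and dependencies passed in through the constructor. The class name `Greeter` and the injected `now` function are hypothetical.

```python
from dataclasses import dataclass, FrozenInstanceError
from typing import Callable


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class Greeter:
    # The clock is injected through the constructor instead of being
    # created inside the class, so a test can supply a fixed one.
    now: Callable[[], str]

    def greet(self, name: str) -> str:
        return f"[{self.now()}] Hello, {name}"


# In a test, inject a deterministic clock.
greeter = Greeter(now=lambda: "2009-09-17")
print(greeter.greet("Noop"))

try:
    greeter.now = lambda: "later"  # rejected because the dataclass is frozen
except FrozenInstanceError:
    print("fields are immutable")
```

Because the dependency arrives through the constructor, swapping it for a test double needs no third-party library, which is the kind of testability the quote describes.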

The Noop website is pretty well built out if you'd like more information.  Google's developers provided all sorts of details about the current state of things and where Noop may go, as well as a place or two in which folks can give feedback.

It'll be interesting to see what sort of traction this Google-y venture into programming languages gets.  Reactions seem a bit muted so far.

Hat tip goes to Darryl K. Taft.

Google Summer Of Code Reflections

by admin on August 27, 2009

Yesterday, we reported that this year's Summer of Code had come to an end.  Google provided a lot of statistics about the program, and the whole thing sounded very impressive.  Today, we got the chance to talk to a student who was actually involved.

Shashank Agrawal is enrolled in a dual degree program at the International Institute of Information Technology, Hyderabad.  During Google's Summer of Code, he worked with Sakai on a collaboration and courseware management platform.

The specific project's quite interesting.  Agrawal explained that it "will enable users to edit web-pages inline," or in other words, it "will allow users to modify web-pages from client side while they are viewing it."  You can't get more user-friendly than that, and a lot of applications seem possible.

Agrawal also picked up a lot while working on it.  He wrote, "I learnt how an open-source product is developed step-by-step, how people help each other out, how they manage to be in sync despite being geographically distant, how tasks are planned and executed, how issues are resolved; and the tools which help in managing all these processes. . . .  And above all this, I learnt how to work collaboratively."

As for the Summer of Code program as a whole, Agrawal came away equally impressed.  He wrote, "SoC exposes you to the indispensable open-source communities which are doing tremendous useful work."  Plus, "Students who successfully complete GSoC definitely have an edge over others during interviews for jobs.  It makes a significant contribution to the resume."
