Anyone who knows me knows that when I talk about the model, I’m usually talking about Propel. I’ve liked Propel ever since I started working with it in the middle of last year; I personally find it easier and more fun to use than Doctrine or other ORMs available today. I was excited to see recently that Propel’s development team had released Propel 1.5 as a beta, with a launch of the new features to come soon.
There are a couple of new features in Propel 1.5 that I think are going to be pretty awesome additions. Here are my two favorites:
Collections and On-Demand Hydration
One of the things you’d sometimes have to do with Propel is fetch an array of objects and iterate through that array to find the data you wanted. However, this created a significant concern: Propel had to hydrate all of these objects at the same time. This means that if you had a limit of, say, 200,000, you could quite possibly run out of memory.
Propel 1.5 adds the ability to have the Collection object hydrate the objects on demand, rather than all at once. This saves memory, because no object is built until it is actually requested. Propel is smart enough not to need to run back and forth to the database; it still has access to the result set, but doesn’t hydrate the objects until it needs them.
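To make the idea concrete, here is a minimal sketch of on-demand hydration in plain PHP. It is not Propel’s API: the Book class, the $rows array, and the hydrateOnDemand() function are all hypothetical stand-ins, and the generator syntax assumes PHP 5.5 or later.

```php
<?php
// Hypothetical stand-in for a database result set: raw rows, not objects.
$rows = array(
    array('id' => 1, 'title' => 'First'),
    array('id' => 2, 'title' => 'Second'),
    array('id' => 3, 'title' => 'Third'),
);

// Hypothetical model class; "hydrating" means building one of these from a row.
class Book
{
    public $id;
    public $title;
    public static $hydrated = 0; // counts how many objects were built

    public function __construct(array $row)
    {
        $this->id = $row['id'];
        $this->title = $row['title'];
        self::$hydrated++;
    }
}

// Yields one Book at a time instead of building the whole array up front.
function hydrateOnDemand(array $rows)
{
    foreach ($rows as $row) {
        yield new Book($row);
    }
}

foreach (hydrateOnDemand($rows) as $book) {
    if ($book->title === 'Second') {
        break; // stop early: the remaining row is never hydrated
    }
}

echo Book::$hydrated, "\n"; // only the objects actually visited were built
```

The raw rows are still all in hand, but the loop only pays the cost of building the objects it actually touches.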
Model Queries
Prior to Propel 1.5, in order to query the database in any unusual way (say, attaching an unusual WHERE clause for example), you had to use a Criteria object. Criteria objects were exceptionally useful, but had certain limitations, the largest one being that they are unaware of the way the model is constructed, and therefore cannot offer additional help at building queries.
With the introduction of Propel 1.5, there is a new set of classes automatically generated called Query classes. These classes extend the Criteria class, incorporating its functionality while offering helper methods that make writing common queries easier. For example, if you’re searching for a user you can find that user more easily because there will be a method named findByUsername() that you can access.
These features, along with lots of other cool components, make Propel 1.5 pretty awesome. I’m looking forward to its stable release and to integrating it into my development in the near future.
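As a rough illustration of what a model-aware helper like findByUsername() buys you, here is a self-contained sketch. It shows one way such a helper could be wired up and makes no claim about how Propel generates its own; the UserQuery class below is hypothetical and filters an in-memory array rather than a database.

```php
<?php
// Hypothetical query class; it only mimics the idea of a model-aware helper.
class UserQuery
{
    private $users;

    public function __construct(array $users)
    {
        $this->users = $users;
    }

    // Turns a call like findByUsername('bob') into a filter on the 'username' column.
    public function __call($method, $args)
    {
        if (strpos($method, 'findBy') !== 0) {
            throw new BadMethodCallException("Unknown method: $method");
        }
        $column = strtolower(substr($method, 6)); // "findByUsername" -> "username"
        $value = $args[0];

        $matches = array();
        foreach ($this->users as $user) {
            if (isset($user[$column]) && $user[$column] === $value) {
                $matches[] = $user;
            }
        }
        return $matches;
    }
}

$query = new UserQuery(array(
    array('username' => 'alice', 'email' => 'alice@example.com'),
    array('username' => 'bob', 'email' => 'bob@example.com'),
));

$found = $query->findByUsername('bob');
echo $found[0]['email'], "\n";
```

The point is that the caller names the column in the method call and never has to assemble the WHERE clause by hand.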
Recently, php|architect announced that they were extending the early bird pricing for the TEK-X conference being held this year in Chicago, IL. As someone who has attended before and will be going again this year, I think this conference represents a great opportunity for anyone who hasn’t been to a PHP conference to attend one.
There are some good reasons that you should be attending.
Before I attended a conference, I had no idea what others were thinking with regards to PHP. The first big conference I attended was ZendCon and it was there that I discovered an entire world that I knew nothing about – the greater PHP community. But more than that, I discovered Really Smart People(tm) that I could interact with, be friends with, and who shared my passion for technology.
If you’ve put off attending a PHP conference, or simply are on the fence, you should seriously consider attending TEK-X this year. The conference itself has a great lineup, but I also know several people who are going to be there that are bright, talented and very enthusiastic about the PHP world.
That’s a great reason to go.
There’s been lots and lots of discussion regarding the Facebook “Hyper PHP” release of HipHop for PHP. This new technology is an in-production converter for PHP that takes PHP code, converts it into C++ code, and creates a complete binary that can be run on a server natively. Facebook claims improvements of up to 50%, and their model represents a shift in thinking about scripting languages like PHP.
Ostensibly, lots of people are going to be thinking about how this will benefit them and their organizations. Here are some thoughts on who will benefit and who will not benefit.
Who benefits
There are lots of benefits for producers of large-scale web applications written in PHP. The ability to convert PHP code into something else that is more efficient without having to retrain their developers will ultimately reduce server load and deployment costs.
In particular, anyone running a web application with more than two servers will certainly benefit, as there is likely the traffic necessary for such a large web application. With a 50% improvement, one could conceivably use only one server; however, for redundancy reasons, it’s likely a good idea to maintain at least two web servers.
Others who benefit include those who distribute PHP code to clients directly. The ability to deploy an application on a web server in a compiled format ensures that reverse engineering is more difficult and protects intellectual property. There is considerable debate as to whether or not compiled or obfuscated code is worthwhile; given that the product designed by Facebook must be efficient, the idea of distributing a PHP-based binary makes a lot more sense.
Who doesn’t benefit
For all the benefits, there is a large group of PHP developers that won’t benefit: the average, everyday developer. There are a vast number of websites that have a single server and not very much traffic; there are also lots of sites that run on shared hosts, or on servers that host several sites at once. Since HipHop runs one web application at a time (with a built-in webserver), it’s impossible at the moment to run more than one website on a box. This is obviously less than optimal for most folks, and will limit adoption.
Also, early adopters of PHP 5.3 won’t be able to use HipHop straight away. Currently, it supports PHP 5.2 (as most of Facebook is written in PHP 5.2), though there are plans to modify HipHop to support PHP 5.3.
Bottom Line
HipHop is still pretty awesome, and will certainly change the way PHP applications are developed. There are great opportunities here, and Facebook has certainly developed a revolutionary system. What will ultimately happen remains to be seen; regardless, it should be an exciting year for PHP development overall.
This entry isn’t about PHP.
It’s about where those of us who develop for a living, PHP or otherwise, see ourselves in the future.
From time to time I do some soul searching and think about where I’ve been and where I’m going. With a tough few months in the books, I wanted to take some time and figure out my next moves, and where I’m heading.
Soul searching isn’t easy, because it requires us to examine ourselves, and in particular, our mistakes, to decide whether or not what we’re doing is appropriate and good. Soul searching means reexamining every hard decision that we made and thinking about whether or not we decided rightly. And it means taking a hard look at where we said we were going, and whether or not we actually got there.
I’ll be doing some soul searching this weekend as I think about the direction I want to go. And then I’m going to employ my resources to get there.
Over time, the PHP DateTime object has become one of the best objects available to PHP developers. This object has grown since early PHP 5 into a robust class that has the ability to do lots of great things.
Recently, I was exploring some of the functionality provided by the DateTime object as of PHP 5.3 (and wishing that Ubuntu had PHP 5.3 as a package distribution). Here are some of the new things in PHP 5.3 that are really cool.
Note: you can read the manual on the DateTime object here.
DateTime::add() and DateTime::sub()
The add() and sub() methods are about adding or subtracting the number of days, months, years, etc. from a DateTime object. The interface is a bit clunky, requiring you to pass in a DateInterval object. However, this still provides an easy way to modify a DateTime object.
For example, let’s say we wanted to add 3 weeks to our DateTime object:
<?php
$dt = new DateTime(); // Set to now.
$dt->add(new DateInterval('P3W')); // P3W means "a period of 3 weeks".
echo $dt->format('n/j/Y'); // Outputs 3 weeks from today's date.
?>
How is this an improvement over using the DateTime::modify() method? It improves on it in one specific way: it’s object-oriented. Rather than passing a string you have the ability to pass an object.
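Here is a short side-by-side sketch of add(), sub(), and modify(). The fixed dates and the UTC time zone are my own choices, picked only so the results are predictable.

```php
<?php
$utc = new DateTimeZone('UTC');

// add(): pass a DateInterval object. P3W is a period of 3 weeks.
$dt = new DateTime('2010-01-01', $utc);
$dt->add(new DateInterval('P3W'));
$added = $dt->format('Y-m-d');      // 2010-01-22

// sub(): same interface, opposite direction. P1M is a period of 1 month.
$dt = new DateTime('2010-03-15', $utc);
$dt->sub(new DateInterval('P1M'));
$subtracted = $dt->format('Y-m-d'); // 2010-02-15

// modify(): the string-based equivalent of the first example.
$dt = new DateTime('2010-01-01', $utc);
$dt->modify('+3 weeks');
$modified = $dt->format('Y-m-d');   // 2010-01-22

echo "$added $subtracted $modified\n";
```

Because the interval is an object, it can be built once and reused or passed around, which a modify() string doesn’t lend itself to as cleanly.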
DateTime::diff()
One of the coolest PHP 5.3 features introduced was the ability to diff two DateTime objects. This returns to you a DateInterval object, which contains the details of how different the objects are.
$dt1 = new DateTime('August 3rd, 2004');
$dt2 = new DateTime('August 10th, 2006');
var_dump($dt1->diff($dt2));
The result that you get looks like this:
object(DateInterval)[3]
This can be extremely useful in determining the time difference between two objects.
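Rather than dumping the whole object, you can read the individual pieces of the DateInterval. This sketch uses the same two dates as above, written in ISO format and pinned to UTC (my assumption, to keep the arithmetic free of time zone surprises).

```php
<?php
$utc = new DateTimeZone('UTC');
$dt1 = new DateTime('2004-08-03', $utc);
$dt2 = new DateTime('2006-08-10', $utc);

$interval = $dt1->diff($dt2);

// Broken-down difference: years, months, and days.
$summary = $interval->format('%y years, %m months, %d days');

// Total number of days between the two dates.
$totalDays = $interval->days;

// invert is 1 when the difference is negative, 0 otherwise.
$isNegative = $interval->invert;

echo $summary, " (", $totalDays, " days in total)\n";
```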
DateTime::getTimestamp() and DateTime::setTimestamp()
Sometimes it’s just useful to be able to grab the Unix timestamp from the DateTime object. But prior to PHP 5.3, to do so required some clunky code using strtotime() and a formatted date string. PHP has fixed this, and you can now use these getter and setter methods to get the Unix timestamp.
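A quick sketch of both methods, next to the older string round trip for comparison. The date and the UTC time zone are assumptions chosen to make the numbers easy to check.

```php
<?php
$utc = new DateTimeZone('UTC');
$dt = new DateTime('2010-01-01 00:00:00', $utc);

// The older, clunkier route: format the date as a string, then parse it back.
$oldWay = strtotime($dt->format('c'));

// The PHP 5.3 getter.
$newWay = $dt->getTimestamp();       // 1262304000

// The setter moves the object to a given Unix timestamp; the object keeps its time zone.
$dt->setTimestamp(0);
$epoch = $dt->format('Y-m-d H:i:s'); // 1970-01-01 00:00:00

echo "$oldWay $newWay $epoch\n";
```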
There are lots of projects heading over to Git these days. It’s not hard to see why: Git offers great merging support, distributed version control, and a great playground. Spots like Github even offer centralization crucial to large open source projects. But when it comes to the corporate world, Git may not be ready for prime time.
Corporate America needs a centralized version control system. Subversion still offers this: Subversion centralizes the repository and simply checks out a working copy (versus Git, which gives you a complete repository). Corporate America still needs to have canonical version numbers, and the ability to see the progress of a product over time as a single line – not a bunch of branches and independent repositories.
Git is a great piece of software. It is fantastic for distributed version control. It is my opinion that when it comes to corporate work, Subversion will still continue to win out.
Shipping code that works is crucial to retaining the support of customers and maintaining high quality in your application. While it’s impossible to ship code without any bugs at all, it is possible to control for as many as possible, and to fix as many known issues as time allows. These strategies are designed to ensure that code works when it is shipped to the end user.
Employ testers.
Developers have a tendency to test their code only with expected data. Testers, on the other hand, aren’t developers themselves; instead, they will use data that you don’t expect and find bugs that your users might otherwise experience.
Hiring testers is a tough sell in many development teams, especially small ones. It is possible to have testers that have other functions – that is, they might be in another department or moonlight as testers. But with teams larger than 5 developers, having a full time tester is a crucial component of good development practices.
Write unit tests.
Every developer makes mistakes at some point. Having unit tests in place will help find these mistakes by showing you where a class breaks. It makes refactoring easier as well, since you can refactor and know that if your unit tests pass, there’s a good chance that you did it properly.
Unit testing should be built into the process of code development from the beginning of a project. However, if you’re starting from someone else’s project and the project doesn’t have unit tests already, simply institute a process of fixing bugs after you’ve written a unit test that identifies the bug. Eventually you’ll have unit tests for most of the application.
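Here is a minimal sketch of that “write the test first, then fix the bug” routine. Everything in it is hypothetical: the averageRating() function, the bug, and the bare-bones test functions, which stand in for whatever unit testing framework your team actually uses.

```php
<?php
// Hypothetical function under test. The reported bug: an empty array
// caused a division by zero. The guard at the top is the fix.
function averageRating(array $ratings)
{
    if (count($ratings) === 0) {
        return 0.0;
    }
    return array_sum($ratings) / count($ratings);
}

// Written before the fix, this test fails against the buggy version
// and keeps the bug from quietly coming back later.
function testAverageOfEmptyListIsZero()
{
    return averageRating(array()) === 0.0;
}

// A test for the existing, expected behavior.
function testAverageOfSeveralRatings()
{
    return abs(averageRating(array(2, 4)) - 3.0) < 0.0001;
}

$results = array(
    'empty list' => testAverageOfEmptyListIsZero(),
    'several ratings' => testAverageOfSeveralRatings(),
);

foreach ($results as $name => $passed) {
    echo ($passed ? 'PASS' : 'FAIL'), ": $name\n";
}
```

Each fixed bug leaves one more test behind, which is how an inherited project slowly accumulates coverage.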
Write functional tests.
Unit tests are great, but they’re not enough. Knowing that one function takes an array and creates an object is fine, but what happens with the next function, and the one after that? Introducing functional tests: testing against expected behavior.
There’s a subtle difference between these two concepts. Unit tests test a specific component of the code: a single method, function, or clause. Functional testing, on the other hand, tests expected behavior: does clicking that button actually result in a refreshed page? Does my controller actually invoke the action properly? More than one method might be acted upon with functional testing.
A lot of this testing is done by the testers; however with applications like Selenium you can conduct some automated functional tests. On a small team that doesn’t have testers, or on a large team where there might be a challenged set of resources, automated functional tests can help reduce the testing burden.
Work unit testing into your build process.
We talked about integrating your build process with a continuous integration server. With build engines like Phing, it’s possible to automate the unit testing process (and even the functional testing process to some degree).
Each time you make a build for release, you should know that all the unit tests pass. If they don’t, there’s a problem that should be addressed before the build is completed.
Use continuous integration to know when tests began failing.
To hit on the same theme, a solid continuous integration server will automatically run your tests and alert you as soon as the first one fails. This helps prevent regression – the introduction of bugs into code that worked in previous releases. The time to discover regression isn’t when the build is due and the team is ready to go, it’s right after a commit, and continuous integration will help with this.
Often when I’m on a job interview, I’ll ask whether or not the company I’m talking with makes use of an automated build system of any kind. More often than not, the answer I get is somewhere along the lines of “build systems are irrelevant to the web; we can simply upload changes instantly.”
This thinking could not be further from the truth. Build systems are just as relevant to the web as they are to compiled code, if not more so. Build systems offer significant advantages to the development of software applications, and it is crucial that developers not take them for granted.
Here are my reasons for wanting a build system for web applications:
Build systems make continuous integration easier.
Having a build system is a critical component of a continuous integration server. For those who don’t know, continuous integration is a process whereby documentation is automatically created, the build is assembled, and the tests are run (among other things). This is a great way to ensure that the build is always “shippable” – that is, the unit tests pass and the documentation is ready to go.
Continuous integration requires a build system to run unit tests and create a shippable version of your software. Having a build system in place makes continuous integration that much more reasonable, and useful.
Build systems ensure that the same process is followed each and every time when making a release.
Ask four different developers how they release a web application and you’ll get four different answers. Even having documentation that outlines the steps, you’ll get slight deviations in the process. That’s the way human beings are; we’re unique, different, and individual. But the build process shouldn’t be.
An automated build process ensures that each and every time the application is built, the same process is followed. That might stand out as a “duh!” but the reality is that following the same process is absolutely crucial, especially when releasing software to a large audience. The potential for human error must be mitigated, lest a mistake get rolled into a release.
With a one-step system, errors are reduced.
An automated build system can still be flawed if it requires more than one step to complete. Each additional step allows for the possibility of human error. But with a one-step build system in place, human error is reduced almost to zero.
Joel Spolsky makes a great case for why you should employ a one-step build system and I support him 100% in his assertions. Having come from an environment where creating releases was something I did, I do know that when you’re ready to create a shippable build, human error can creep into any system that has more than one step.
Creating a build system reinforces the entire development process.
With a build system in place, there’s a formality to the development process that wasn’t there before. Code that is committed doesn’t simply go into the ether; instead, it is compiled into the latest build, tested, and documented. Developers relying on continuous integration see the immediate impact of their code. And since the build process probably includes testing (it should, anyway), there’s a formal process to getting code into the product.
Requiring automated builds and a build process ensures that the development process has a set of standards that must be followed when software is released. This is crucial to ensuring that the development cycle has a rhythm to it.
A build system introduces formality to the release process.
With a build system in place, when it comes time to make the release build, there’s a formality to the release that is different than simply “uploading the changes.”
I’ve found that when I prepare to “do a release” I am that much more careful, even if I’ve done a release a hundred times before. Why? Because my name is on it, and I have a process I must follow. It’s not something I do each and every day; it’s special.
When all you do to make a release is upload files to a web server, there’s nothing special about that. There’s nothing unique. In fact, you probably do it with some regularity. You’re not replacing the entire application with a new version; you’re just replacing a few files with some other files. This isn’t release management.
Build systems are still relevant. They haven’t gone out of style, and they still have a place in the development of applications, including applications that are compiled at runtime. There are lots of build systems out there, Phing being one of the more popular PHP-based ones for PHP projects.
Lots of marketing students and sales professionals each year are required to read the book How To Win Friends And Influence People and for good reason: the book stands alone as one of the greatest books on sales ever. I decided to co-opt the title of that great book for this entry, because I want to talk about how to sell your company to developers – particularly, how to get the best developers to do the best work and make your company, well, the best at whatever it is that you do.
Developers are not interchangeable.
If there’s one thing that’s absolutely frustrating about human resources types, it’s their insistence that problems can be solved by adding more people. To be fair, their role is to provide the human capital needed to get the job done; however, at the end of the day developers cannot be changed out like spare parts in an engine.
Software development is an art, and like any art, the artist is the most crucial component determining the outcome of the artwork. Lose a good developer and your masterpieces (your products) will suffer for it; you cannot simply replace excellent talent with anyone like you can replace someone who fries hamburgers at McDonald’s.
The Mythical Man-Month touches on this with a considerable degree of competency. Because the job of developing software requires considerable thought, it’s not something that can be done by X Developer or Y Developer, but must be done by specialized individuals who are right for the particular job.
Use the best tools, whether or not they’re free.
When asking companies questions from the Joel Test, I used to ask his question about whether or not the company uses the best tools money can buy. Then came a creative answer from one company, which was “we try, but we’ve found that some of the best tools are free.” This company had made a concerted effort to find the best tools for the job: they were willing to pay for the best tools if they cost money, and used the free ones, if they were the best. This is an excellent policy.
Nothing frustrates developers more than using substandard tools. A free bug tracker isn’t useful if it’s not user friendly or easy to comprehend. An open source Subversion browser isn’t worthwhile if it doesn’t do what the team needs. Conversely, paid applications are not better simply because they cost something; there are plenty of development tools that are free and are better than the paid ones (lots of people argue over Netbeans versus Zend Studio, for example).
Companies should be willing to spend in the areas where it matters: technology, software, and “toys” – systems that impress their employment candidates and developers. They should save resources when it’s appropriate to do so, and spend when the situation calls for it. As Joel Spolsky points out, a cool computer system can woo a developer even if you don’t pay the highest salary.
Nothing can make up for bad managers.
The nature of software development, particularly the fact that it’s closer to an art than a science, makes it a very difficult area to manage. It’s not well understood by other departments, and developers tend to keep to themselves. Thus, managers of developers are often unsure about how to manage development teams, and other departments tend to blame the developers (because humans place the blame on that which they do not understand).
These types of situations are common, and are toxic to development teams. No amount of cool toys, amazing salaries, great perks and flexibility will make up for a crappy management situation. Developers tend to be a smart bunch, and often they know the values of their skills. Developers are also often introverts, and would rather simply move on than fix whatever is wrong.
The difficulty in managing a development team is matched only by the amazing productivity one gets from a gelled, fully functional development team. It is worth it, and companies that grasp this will be successful.
Technical people should be led by technical managers.
There’s a famous saying that “those who can, do; those who cannot, manage.” This seems to be sadly true in the technical world, where technical leads are not experts in the fields they manage. This is deadly to a development team.
It’s true that development and management require different skill sets; developers like to have all the pieces and struggle without them, while managers typically have th
Truncated by Planet PHP, read more at the original (another 3270 bytes)
In software development, it’s crucial to track bugs and new features, and to be able to know exactly where a project is at any given moment. Bug tracking is crucial to this goal; it allows a project manager to know what has been finished and what still must be done, as well as to outline to each developer their goals and responsibilities.
Most developers agree on the importance of bug tracking. Here are five tips I use when utilizing bug tracking.
Institute bug tracking for the development team, even if management won’t.
Occasionally I’ve been on a team where the management team doesn’t understand the point of bug tracking. However, upper management need not understand in order for the tracking to be effective. Bug tracking is, for the most part, a developer’s tool, to track errors, add features, and improve the development process.
There are lots of open source bug tracking packages that are free and easy to use. They can be set up on any server, and usually don’t require special setups.
Create bugs if other teams won’t.
It’s possible that other teams may also not understand the purpose or value of bug tracking, and will refuse to file bugs in the system. This is especially prevalent if the upper management team doesn’t support bug tracking.
It’s tempting to struggle against other teams to convince them to use the bug tracking system. This will almost always fail. Instead, it’s often easier to institute a policy amongst the development team to convert service requests into tickets yourself. As a team you may also consider filing the tickets with the requester as the person watching the ticket. After a few closed tickets, other teams may see the value and begin filing tickets.
Track every bug, big and small.
There are two reasons why you want to track each and every issue in a bug tracking database: first, because it’s impossible to hold all the information in your head, and second, because it creates some very good documentation of the project over time.
Joel Spolsky does a fabulous job explaining how remembering bugs gets more difficult as time goes on. However, the value of the documentation created by bug tracking is often overlooked. When bugs are filed, and fixed, and time goes on, it’s easy to generate statistics on who creates, who fixes, and what fixes are applied in the code base. New bugs can be linked to old bugs (to see regressions), and all the problems that have ever been solved are in one place.
For this, it is crucial that whatever bug system you use has awesome search capabilities. It’s also important that developers write detailed notes on their solutions, or link to repository commits so that other developers can see the changes made.
Track bugs and features together.
While it’s known as a “bug database” it should really be an “issue database” because it should record everything, bugs and features together. Why? Because it makes it easier to know what work is on the table before getting started on a particular project.
Having both bugs and features in the same place means that the project manager as well as the developers can see the tasks on hand, as well as determine what should be pushed into the next release. Often, when considering a new release, project managers only consider new features, because they don’t see the amount of work that must be done to fix the bugs. Having them together will help ensure that this doesn’t happen.
Fix old bugs before starting on new features.
The link to Joel’s post above outlines some of the benefits of starting on bugs before working on new features. I won’t rehash his points; what I will add is this: fixing old bugs first is crucial because developers hate doing it. Developers like working on new code. Developers hate going back and working on old code that they’ve already written, or that other people have written. Forcing developers to fix outstanding bugs before they work on the “latest and greatest” new features means that the bugs will actually get fixed, rather than being pushed off and pushed off until they’re all that’s left before shipping the product.
I took a few days off during the holidays to think about what I wanted to accomplish during 2010. 2009 was a great year, with lots of accomplishments: this blog hit 20,000 unique visitors in a month, was published in a lot of different places, and, according to lots of different sources, put together some really great content.
But my 2010 goals are about more than just this blog. They’re about my development as a PHP developer, as well as my personal life. Here are my goals for the 2010 year:
What are your goals for 2010?
If you ask most developers about source control, they’ll agree that it’s a wise thing to use. They’ll insist that they think it’s important. Yet why are so many companies out there still not using source control in their projects? A good number of companies that I’ve worked with failed to make use of source control, resulting in issues that would have been trivial otherwise. In this article we’ll explore ways that you can help change this policy if your company isn’t using source control.
Source control doesn’t need to come from the top
The first oft-considered misconception is that source control must be endorsed by upper management in order for developers to use it effectively. This is 100% incorrect. There are a number of ways that developers can make use of source control, even if management fails to embrace it, or rejects it altogether.
Subversion can be set up locally on most systems; a repository can be created in a central location and, as long as you do a regular backup, your repository should be secure. Git is designed to have the repository locally by default; this makes a great source control system if you have total ownership of the project.
Bear in mind that the point of version control is two-fold: first, to make it easy to collaborate with a larger group, and second, to help version your code so that you can see changes over time and possibly roll back if there are problems. In larger teams, personal repositories won’t work, but in smaller teams or in places where you own your project, you can use source control.
The person that needs convincing is probably not the upper management at all
Many of my friends work for “creative firms” that are marketing first, development last. These firms are set up to provide excellent marketing to the clients, but more or less may not care about the development team. While this is often frustrating, it can also be a blessing in some ways.
As long as the team lead can be brought on board, placing code into a repository can be easy. If the team is larger than three or four people, using Subversion or another VCS is crucial; more likely than not there will be other owners of the project and everyone will need to collaborate.
The cool thing is that most of the time upper management types don’t care how the development is done as long as it is in fact done by the developers. Thus, convincing upper management isn’t the way to accomplish the goal of implementing version control.
Version control is not a silver bullet
A few times in my career I’ve heard someone say something along the lines of “oh, if only we had been able to use version control. Then that project would have been on time.” Bzzt! Wrong.
It might seem obvious that version control is not a silver bullet but it is often treated as such. It’s such an important and ubiquitous tool that it’s easy to think that having it will solve all development problems. But it’s not a replacement for good management, good time management skills, and solid development practices.
Version control should never be cited as a way to make development faster. It doesn’t generally do this (with the exception of generally preventing developers from accidentally overwriting someone else’s work, causing duplication of efforts).
Version control is best utilized before a crisis
Again, this might seem obvious, but there’s a harsh reality amongst developers: most of the time, when they feel as though they’re not permitted to implement version control, they have a tendency to say “Well, next time code gets lost, that’ll teach ‘em.”
Don’t think like this! It’s just going to ultimately create more work for you, the developer, when the management wises up, realizes that they need version control, AND that they need the code rewritten that was ultimately lost. You’ll have double the work, and even though you’ll get what you want, you won’t like it.
Fight for version control consistently BEFORE a disaster. This will improve your chances of having it in place prior to some disaster striking.
In the time that I have developed software, I don’t know that I’ve ever met a developer who got excited about writing specs for anything. In fact, most developers loathe writing specs, or developing schedules of any kind. It’s not that they’re lazy, or that they don’t want to be held accountable; most of the time it’s because developers prefer to express themselves via code, or because developers are afraid that if they set a schedule, and then reality doesn’t match up, they’ll be forced to produce sub-standard code. Neither of these is an ideal situation.
This is directly at odds with the business need for specifications and schedules. Businesses need schedules to know when products will be finished, to plan things like trade shows and product launches, and to write contracts with clients who need or want a particular product. It’s not as if businesses want to push their developers to insanity by forcing them to schedule and then stick to it; more often than not thousands of dollars hinge on the schedule, and it simply must be met.
Schedules and specs are a core component of software development, and business development; so much so that Joel Spolsky included developing both as core components of the Joel Test. While developers hate writing specs and developing schedules, there are some painless steps they can take to create them.
Specs need not be complete documents
Lots of times, specs need not be complete documents that contain specific, detailed information. In fact, lots of times specs need not contain a single complete sentence to be effective. Specs are, in their most basic form, a description of the way that something should work. A wire frame, if done properly, is an adequate spec. The workflow of an application together with markers, is a spec. My specs are simply drawings of the view that I want together with some notes, arrows, and perhaps some writing on the back; this works well because it describes the way the application should look, and the rest of the spec contains database diagrams and other views to enhance understanding.
Writing a spec doesn’t actually mean opening up Microsoft Word each and every time. And the client shouldn’t be expected to generate a document like that, either (see the third section below). Specs can be representations of functional apps.
Specs are not documentation
There seems to be another push in many groups to be able to use the spec as documentation later on. Don’t do this! Specs are not documentation. They are descriptions of how the thing should work, not diagrams for how it does work.
Actually implementing the details of the spec may result in a disconnect between the implementation specified and the implementation completed. This is normal; programmers often discover logical mistakes in specs when they actually go to implement them, and must make adjustments. This occurs in every field; engineers, for example, must sometimes adjust their underground services to route around unmarked power or sewer lines. Deviation from the spec is normal, as long as the end product looks like the spec.
However, if the spec is meant to be the documentation, and there is no time in the schedule for writing documentation, there’s intense pressure to make the spec and the implementation look identical. This leads to unusable or difficult software, or worse, bugs. Spend time to write documentation, rather than relying on the spec to do double duty.
The development of the spec should be done by the developers, not the client
There’s an old customer service adage: The customer is always right! This is not true, however, when it comes to the details in the spec.
Consider: when you hire an architect, you tell him that you want a house. You do not, however, hand him the blueprints. Instead, he draws up plans and works with you to develop the blueprints for your house. Handing him a set of blueprints and telling him to build the house would eliminate the need to have the architect in the first place! The same is true for the development of specifications.
Developers should work with the clients to make sure that the needs of the client (“I need an invoicing system” or “I need a new website with a blog”) are translated into technical specs (“We need to use Zend_Pdf with Symfony” or “Drupal is the best option here”).
Truncated by Planet PHP, read more at the original (another 4990 bytes)
Last week, Aaron Brazell posted a blog entry about the state of the Wordpress and PHP communities. At the same time, Keith Casey was in Redmond, Washington, where he was experiencing the Microsoft Web Developer’s Conference. As so often seems to happen with “Aha!” moments, both men came to pretty much the same realization at the same time: the Wordpress and PHP communities need each other, but don’t do nearly enough to work with each other.
Keith made his point clear when I explained to him that I agreed with what Aaron was saying in his blog post, but that Wordpress supporting PHP 4 was Wordpress’ “fatal flaw.” In his…articulate way…he reminded me that Wordpress existed and flourished, in spite of our attempts to attack their support for PHP 4. Their use of PHP 4 was certainly not a fatal flaw, as much as our arrogance as a community seems to be.
The one thing I hate about Keith is that he has a tendency to be right. And he was right this time.
The reality is that you can’t separate the success of PHP from the success of other open source tools like Wordpress and Drupal. As they have flourished, so has PHP; the success of each is interconnected with the success of the others. Yet it seems that the PHP community is fond of attacking and demonizing these communities, which only serves to drive them away from the PHP language.
Wordpress and Drupal could exist independently of PHP, if PHP didn’t exist as a language. The need was there; they would have been written in Python or Ruby on Rails, or Fortran; the language is irrelevant; the need was paramount. The fact that they were written in PHP is an accident of history, not a divine destiny. They were written in a time before PHP was widely accepted. And they were written with the tools available at that time.
In short, we, as the PHP community, need to lay off. And we need to put up or shut up. Hate the fact that Wordpress still supports that “dead language” of PHP 4? Write a patch. Hate that Drupal’s UI would make a geek cry out in agony? Write a patch. These are open source projects; we each can look at the source and make them better. But we should never, ever attack these projects for not conforming to our standards. Especially if we’re not improving them.
And we need to be better about interfacing with the communities that surround these projects. Each has a robust community that doesn’t identify with PHP. They don’t understand PHP, and the reasons it works the way it does. Many of them blame PHP for the troubles of the product. We need to be proactive about reaching out to these groups, and interacting with them. We need to be receptive to their needs. And we need to work with them to improve PHP overall.
A lot of time and effort goes into designing processes for development projects when the projects are professional or work-related projects. We spend hours investing in version control, bug tracking, specification design, and process.
But what about our own personal projects, the ones we do either for money or for fun? Too often, it seems like these development practices are abandoned, especially with regard to the use of a bug tracker. I know I have personally been guilty of failing to use a bug tracker, even though I use things like Subversion and develop specifications. It’s easy to forget, but important to remember. Here are five reasons why our personal projects should utilize a bug tracker.
These are personal anecdotes from coming back into the “bug tracking fold” after being away for a while.
1. Our minds are imperfect repositories of information.
As we work through personal projects, there’s often a much more intimate relationship between ourselves and the product. This often leads to the feeling that bugs and necessary features no longer need to be logged in a bug tracker because we’ll remember them. The reality of the situation is that we cannot remember everything we need to remember. Details will be forgotten, unless we use some sort of a system to remember them later on.
2. Bad development practices can form.
When we opt not to use a bug tracker in our own personal projects, we get into the habit of writing code without one. This is a poor habit to get into, because it is one that will affect us when we go to our jobs. The truth is that habits are easy to form, and conflicting habits will always be resolved to the most lazy habit, because we’re human.
3. It makes it harder to force ourselves to use bug trackers for private paying clients.
Most developers I know have a few private clients that they work with who are not connected to their employer. It is crucial that we employ the same level of process and use the same development practices with our private clients as our employer’s clients, because these clients are still paying for our services and deserve our best work. The problem is that if we have let ourselves slip on the personal projects, we will find it increasingly hard to keep bugs filed for private clients. This will lead to a lower quality product, which is less than optimal.
4. Predicting time to completion becomes more difficult.
When you’re at work, it’s easy to plug on and complete a project – it’s what you’re paid to do. But at home, for personal projects, it’s much harder to plug along without some assurance that the project is nearing completion. Having a task tracker lets you see the things that have been done, making personal projects easier to manage and making it easier to see how long it will take to finish something (which is a reward in itself).
5. Seeing progress being made is that much more difficult.
Along with predicting when it will be finished, using a bug tracker also makes it possible to see how far along you are in a project. I know that when I create tickets for everything in a project, I can easily feel good that I closed six or seven of them; I feel like I’m making progress and I have statistics to show for it.
Last month was a record month, fueled by a front-page story at Reddit. This blog had 27,663 unique visitors with 52,343 visits and 125,164 page views. That’s an astounding amount of support from the PHP community and the programming community at large. Thank you. I’m overwhelmed, and excited that this blog has the ability to reach large numbers of people. I enjoy writing, and I enjoy discussing programming best practices, as well as talking about the things that matter to PHPers around the globe.
With this infusion of readers, a number of issues have begun to surface related to the comments, and particularly, the appropriateness of some users’ comments. While many of these comments are left in moderation, and never reach the end user, I believe it is crucial that each and every comment be beneficial, useful, and appropriate. Thus, it is necessary to lay down the criteria I will use to judge the worthiness of a comment before it is approved, or the worth of a comment if it is posted without moderation.
This is in no way an attempt to censor the comments posted here; they are a crucial component of the discussion. It is important to remember that there are no rights when it comes to commenting on a private blog; however, I am committed to honest discussion and debate. This code is designed to encourage that, while clarifying the boundaries of what is acceptable.
1. This blog is about PHP.
Each post here, for the most part, is about PHP (with few exceptions). It is my goal to provide high-quality content about developing PHP, and each post is designed to do just that.
Thus, it is not really alright for people to post comments along the lines of “PHP sucks for this” or comments about why PHP is a poor language choice for whatever it is that I’m writing about. It is alright to discuss how it is done in other languages – provided the discussion enhances everyone’s understanding of the points raised in the blog post. This blog is an inappropriate place to debate whether or not PHP is better than Rails or Python; that discussion has its place elsewhere.
2. This blog is in English.
I’m proud of the fact that readers from around the world come here to read the entries that are posted. I recognize that while PHP may be language-agnostic, others may have a preferred language, and may not be the most competent in English. That said, because this blog is in English, the comments must also be in English. Comments in other languages are not helpful to those that do speak English (which is the vast majority of visitors here).
I have already licensed my content for reproduction in other languages, and you are free to translate it into your language and publish it for discussion in that language. But comments made here must be in English.
For that matter, links posted must also be in English, as must names (no Chinese characters, please; they’re often captured by the spam filter anyway).
3. The anonymous comments feature is disabled for a reason.
I have staked my reputation to what I write here. My reputation lives or dies by my accuracy, my poise, and my ability to generate excellent content three times a week. When I write something, I stand by it; this is because I have to, not always because I want to (and sometimes I write things that are incorrect and have to eat some crow).
With that said, I think it creates a double standard if people try and comment without being linked to their words. Since most comments try and add value to the discussion, it’s important for people to be able to judge credibility. Anonymous comments don’t allow people to determine this credibility.
If I can’t identify you, then your comment stands almost no chance of being approved. This means that if your email address is clearly a forgery, or your name is “Anonymous” or “Won’t Say” or something like that, I’ll probably reject or remove the comment. Credibility and accountability are crucial in the discussion, and I am in favor of both.
4. Discussions are not personal.
Attacking others is just wrong. Please don’t do it. This rarely happens; however, if it does happen, the offending person will have to be banned. The comments are a place to ask questions, and if someone proposes something inaccurate or incorrect, it is acceptable to challenge them. But it’s not acceptable to attack them.
Comments aren’t the place for reporting grammatical or coding errors
Occasionally I make mistakes. When you discover one, use the contact page to let me know, rather than posting it in the comments. The reason? Because once the mistake is fixed, your comment is irrelevant and incorrect; this isn’t beneficial to anyone.
Summary
I’ll probably update this as time goes on, and I look forward to the discussions that will happen in the coming months. There are a lot of exciting posts coming up soon, and I look forward to discussing them. These points will help keep discussions civil, reasonable, and informative for those who come later.
It’s happened to each and every one of us: we fill out a long form, complete with username and password. We double and triple check everything, because we want to make sure the submission works. We verify our email address, our date of birth, and even maybe retype our password, just to make sure they’re both right and they both match. And then we fill out the CAPTCHA, with so much care (passing those things is still random, whether you’re a human or not). And then we hit submit.
And we wait. Breathless.
What happens next? Well, our form pops up back in our face, with a bunch of red over it (if we’re lucky) saying that we’ve done something wrong. Seems that we entered our phone number as “1234567890” instead of “123-456-7890” or we entered our date of birth as “6/8/1995” instead of “1995-06-08”.
Oh, and our password is gone (because that’s how password fields work) and the CAPTCHA we beat…we have to beat it again.
Why on earth has this happened? The simple answer is that whoever designed the form decided to place the validation of the data, and its massaging into the proper format, onto the end user. But there’s a more complicated issue at hand here: the fact that the developer either felt it wasn’t his responsibility to do the data formatting, or didn’t realize that not everyone would think to place dashes or format dates the way he does.
The sad thing is that data formatting is both easy and often overlooked. Developers in a hurry will place the data formatting obligations onto the end user, rather than writing code to do it themselves, when really the code is so very simple. Take, for example, a small function that formats a phone number properly:
function formatPhoneNumber($number)
{
    // Keep only digits, plus signs, and minus signs
    $phone = filter_var($number, FILTER_SANITIZE_NUMBER_INT);
    $phone = str_replace('-', '', $phone); // We want to put our own dashes in the right place
    $phone = str_replace('+', '', $phone); // + survives the integer filter but doesn't belong in a phone number
    // Anything other than ten digits may be a foreign number; return it untouched.
    if (strlen($phone) != 10)
    {
        return $number;
    }
    $areaCode = substr($phone, 0, 3);
    $prefix = substr($phone, 3, 3);
    $lastFour = substr($phone, 6);
    return $areaCode . '-' . $prefix . '-' . $lastFour;
}
Admittedly, the above function isn’t perfect; it certainly doesn’t validate the format for each and every number and in fact doesn’t test to ensure that the phone number contains anything more than a single number. But it is a step in the right direction. With a little bit more work we could make it fully functional.
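As a rough idea of what that "little bit more work" might look like, here is a minimal sketch. The function name normalizeUsPhone() is hypothetical, and the choices to strip a leading US country code and to return null on failure are my assumptions, not requirements from anywhere else in this post:

```php
<?php
// Hypothetical helper: normalize a US phone number to 123-456-7890.
// Returns null when the input cannot be a ten-digit US number, so the
// calling code can decide whether to ask the user for help.
function normalizeUsPhone($input)
{
    // Drop everything that is not a digit: spaces, dashes, dots, parentheses, "+".
    $digits = preg_replace('/\D+/', '', (string) $input);

    // Assumption: eleven digits starting with 1 means a US country code was included.
    if (strlen($digits) === 11 && $digits[0] === '1') {
        $digits = substr($digits, 1);
    }

    if (strlen($digits) !== 10) {
        return null;
    }

    return substr($digits, 0, 3) . '-' . substr($digits, 3, 3) . '-' . substr($digits, 6);
}
```

With this, "(123) 456-7890", "123.456.7890" and "1234567890" all come out the same way, and the form only needs to bounce back to the user when null is returned.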
PHP also includes a number of date options, one of my favorites being the strtotime() function. strtotime() takes almost any date and converts it into a Unix timestamp. It even takes arguments like “today” and “tomorrow” and “next Wednesday” and gives a Unix timestamp for those values. PHP 5.3 has improved DateTime object handling, meaning that any value you give to strtotime() can be given to the constructor of the DateTime object, and you can then manipulate that object to format the date in any way you see fit.
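To make that concrete, here is a small sketch of accepting whatever date the user typed and storing it in one format. The helper name normalizeDate() is made up for this example, and passing UTC explicitly is just my way of keeping the result independent of server configuration:

```php
<?php
// Hypothetical helper: accept a free-form date string and return YYYY-MM-DD,
// or null when PHP cannot make sense of the input.
function normalizeDate($input)
{
    try {
        // The DateTime constructor understands the same formats as strtotime().
        $date = new DateTime($input, new DateTimeZone('UTC'));
    } catch (Exception $e) {
        // The constructor throws when the string cannot be parsed.
        return null;
    }

    return $date->format('Y-m-d');
}
```

Both "6/8/1995" and "June 8, 1995" come back as 1995-06-08. One thing to watch for: an empty string is treated as "now", so required fields still need their own check before they get this far.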
Each of these solutions would prevent the user from having to format the data for us, unless the user failed to enter the appropriate data. But in 90% of the cases, the form would submit properly on the first try if it was filled out correctly and the users would be much happier for it.
The bottom line here is that data formatting is our job as developers. We have an obligation to make the forms we create as easy to use as we can; it is not the job of the end user to send us data that matches the format of what we want. We should be prepared to format the data ourselves, and ask the user for help only when the data they submit is incompatible with what we’re trying to accomplish. This is the point of feedback; use it wisely.
Thanks to Marco Tabini for reminding me that this was something I had needed to write about for a long time.
Trac. CruiseControl. phpUnderControl. Jira. Bugzilla. These are all intensely popular development tools. And not a single one of them is written in PHP.
Why?
Trac is written in Python. CruiseControl is written in Java, and phpUnderControl is built on top of CruiseControl. Jira is written in Java and is a commercial program. Bugzilla is written in Perl. All of these programs have either been around for a long time, or they have commercial components attached to them.
Some might argue that PHP is a lesser language, and thus incapable of producing the results that Python and Java can produce. Others might argue that other languages are more mature. But the truth is that these applications don’t exist in PHP simply because PHP wasn’t previously capable of producing them. With PHP 5’s object model, PHP is finally able to produce the high-quality applications that developers can use. PHPUnit is a perfect example of that.
Next year, I’m devoting myself to writing and developing open source applications for the PHP community to use. These applications will consist of a continuous integration server, a bug tracker (that doesn’t suck or look terrible), and an SVN browser. These applications will be available for free, and will be community driven. I’m taking the initiative because these applications are great ideas but few people have the time; thus, I’m dedicating myself to writing them and making them available for others.
This will be my big contribution to the PHP community in the next year. I think it’s worthwhile and necessary. And it should be a lot of fun, too.
As Benjamin Franklin once famously said, “the only two things that are certain in life are death and taxes.” His point, while political, has a good perspective on one of life’s ever-persistent truths: the fact that governments exist in every country, and, largely, they have some of the same benefits and drawbacks everywhere.
However, the ubiquity of governments around the world also gives us a unique opportunity to learn some lessons from them as developers, particularly about principles of object oriented programming. Governments serve as perfect object lessons (pun intended), demonstrating some of the good, the bad, and the ugly object-oriented practices we see.
First a warning: because the purpose of this post (and this blog) is not to discuss politics, comments about politics won’t be approved. Each comment will be moderated.
In reality, government serves as one big object-oriented application in many ways. Government excels in the areas of abstraction, encapsulation, the implementation of dumb objects, decoupling and giving one class a single responsibility. Let’s examine each of these as programming practices and see what we can learn.
Abstraction
Governments are really good at abstraction – almost too good, sometimes. Abstraction is the principle that you can take complexity and break it down internally to ensure that the appropriate object, with the appropriate inheritance and components, does the appropriate task.
Governments do this largely by creating a bureaucracy. Programmers do this by creating parent and sub- classes that have specific responsibilities. Object-oriented applications are, by default, bureaucratic – when abstraction comes into play, you give objects specific responsibilities and abstract out the peripheral things that might otherwise be included when doing functional programming. However, it works well in object-oriented programming because objects follow specific rules absolutely – meaning your “bureaucracy” doesn’t make mistakes along the way.
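A tiny sketch of that kind of "bureaucracy" in PHP might look like the following. The class names are purely illustrative; the point is only that the parent class owns the shared procedure while each subclass owns one specific responsibility:

```php
<?php
// Illustrative names only: a parent class with the shared procedure,
// and subclasses that each handle exactly one kind of request.
abstract class Department
{
    // Shared behavior lives in the parent class.
    public function process($applicant)
    {
        return 'Processed ' . $this->handle($applicant);
    }

    // Each subclass must supply its own specific responsibility.
    abstract protected function handle($applicant);
}

class PermitOffice extends Department
{
    protected function handle($applicant)
    {
        return 'permit for ' . $applicant;
    }
}

class LicenseOffice extends Department
{
    protected function handle($applicant)
    {
        return 'license for ' . $applicant;
    }
}
```

Calling process() on either office follows the same rules every time, which is the one advantage our bureaucracy has over the real thing.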
Encapsulation
This is one area of object-oriented programming that we don’t discuss often enough (I’m not sure I’ve ever mentioned it). Encapsulation is the practice of concealing the inner-workings of an object from the objects that pass messages to it.
Governments do this by creating front-line and back-line workers. Take, for example, the courts: you have clerks, judges, and others who deal with the “customers” – criminals, lawyers, plaintiffs, etc. But each court system has hundreds of other workers who file, copy, collate, prepare dockets, update websites, and do other functions that are behind the scenes but are crucial to the operation of the object (the court).
As developers, we can do this by shielding internal functions with the protected and private keywords (in PHP, anyway). These keywords make it impossible for those outside our objects to mess with the inner workings, but still allow those inner workings to proceed. We have our “front line” (our public API) and our “back line” (our protected and private methods) which do other crucial, internal work.
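Sticking with the court analogy, a minimal sketch might look like this. The Court class and its method names are invented for illustration; what matters is which parts are public and which are not:

```php
<?php
// Illustrative class: the public methods are the "front line",
// while the docket and its preparation stay behind the scenes.
class Court
{
    // Back line: nobody outside the class can touch the docket directly.
    private $docket = array();

    // Front line: the only way to get a case onto the docket.
    public function fileCase($caseName)
    {
        $this->docket[] = $this->prepareDocketEntry($caseName);

        return count($this->docket);
    }

    // Front line: look at the first case on the docket, or null if it is empty.
    public function nextCase()
    {
        return isset($this->docket[0]) ? $this->docket[0] : null;
    }

    // Back line: internal clerical work, hidden from callers.
    protected function prepareDocketEntry($caseName)
    {
        return strtoupper(trim($caseName));
    }
}
```

Code outside the class can file a case and ask for the next one, but it can neither rewrite the docket nor call prepareDocketEntry() itself.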
Dumb Objects
Lots and lots of people complain about how each department they work with independently requires verification of the same documents before they can proceed. For example, in applying for the proper permits for my business, I had to submit proof that my LLC was established and in good standing before I could get a Home Occupancy Permit, and I’ll have to show the Home Occupancy Permit paperwork before I can get a business license. And this is the same department of Washington’s government – the Department of Consumer and Regulatory Affairs. You’d think that they could work together and share the same database, right?
Not so fast! The government has done something that more programmers ought to do: it’s created “dumb objects.” This is not to say that the workers who do the work are dumb; instead, the departments themselves are dependent on you giving them (dependency injection) the proper documentation.
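In code, the same idea might be sketched like this. Both class names are hypothetical stand-ins for my paperwork; the office never looks anything up on its own, it only works with what it is handed:

```php
<?php
// Illustrative classes: the applicant brings the certificate,
// and the office depends entirely on what it is given.
class LlcCertificate
{
    private $goodStanding;

    public function __construct($goodStanding)
    {
        $this->goodStanding = (bool) $goodStanding;
    }

    public function isInGoodStanding()
    {
        return $this->goodStanding;
    }
}

class HomeOccupancyPermitOffice
{
    // The dependency is injected as an argument; there is no database
    // lookup or global state hidden inside this method.
    public function issue(LlcCertificate $certificate)
    {
        return $certificate->isInGoodStanding() ? 'permit issued' : 'permit denied';
    }
}
```

Because the office knows nothing about where certificates come from, it is trivial to test: hand it a certificate in good standing and one that is not, and check the two answers.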
People have debated for a long time on whether or not this is appropriate behavior for government, but it the undisput
Truncated by Planet PHP, read more at the original (another 3387 bytes)
Since being laid off last month, I’ve thought long and hard about what I wanted to do next. After consulting with my fiancee and with friends and colleagues, I decided that the best approach would be to begin working towards my own consulting company. And so, several weeks into the process, I’ve laid the groundwork and today, I can announce it.
Blueprint DC, a full-service custom software development company based from Washington, DC, is officially open for business.
After being a PHP developer for five years, it feels awesome to be able to go into business, and work on great projects. I look forward to the awesome relationships I’ll establish and the great work I’ll be able to do for a multitude of clients, instead of just one or two (my employers).
So feel free to browse the site and get in touch with me to meet your PHP development needs. I look forward to working with you!
In the United States, we take the fourth Thursday in November to give thanks for all the things we appreciate, and all the people who matter. I know that without PHP, my life wouldn’t be nearly the same; the language I use every day plays a critical role in everything else I do: it provides money to pay bills, a sense of accomplishment, and the joy of working in a language that is truly fantastic.
We owe lots and lots of people for the PHP language, people who are not often thanked nearly enough. Thus, I have decided to pick five people that have contributed significantly in the last year, and (in no particular order), discuss their accomplishments and thank them each personally. I realize that PHP is an effort of several hundred individuals, and that each one deserves credit; there simply wouldn’t be the space to thank them here personally, but each contributor should know that I personally appreciate each and every one of you.
So, without further ado, thank you to:
Felipe Pena
Described as the “unsung hero of PHP”, Felipe Pena is one of the most important developers of PHP that no one has ever heard of. He goes around PHP fixing all sorts of bugs, as well as working on the new features, and even fixing bugs in PECL packages when asked. He told me that he got started by adding a patch and was hooked – something that others can learn from regarding their ability to contribute.
Scott MacVicar
As the maintainer of the Mac OS X package of PHP, Scott is all too familiar with the problems Apple hands developers in their development environment. Scott is an active participant in trying to make PHP run better on the Mac, and has contributed to various extensions, including JSON, Fileinfo, SQLite3 and PDO_SQLite. He bears a large amount of responsibility, as well as being heavily involved in PHP London as its President.
Elizabeth Smith
The PHP language works exceptionally well on Windows, and much of this is due to the work of Elizabeth. She has done a considerable amount of work getting PHP to work on Windows. She tests everything (extensions, frameworks, core builds) on Windows and regularly reports bugs when she finds them. She also maintains PHP-GTK, as well as the Cairo package in PECL.
Derick Rethans
Derick is an expert in all things date and time related. Derick is the author of the authoritative book on dates and times, Guide to Date and Time Programming. He is also the author of Xdebug, which is the single most important extension in the PHP language (install it! You will thank me). Derick deserves praise for his involvement and versatility in the PHP community, and is one of the best resources in the community.
Ilia Alshanetsky
Ilia deserves praise as a regular bug-fixer, as well as being partially or completely responsible for such extensions as SQLite, PDO, GD, and others. He’s served as release manager for several releases as well, ultimately being responsible for rolling the release and getting it out the door for PHP users to utilize.
One of the most hotly contested points of my article on database design was the suggestion that developers drop the use of ENUM and use something else instead. Lots of people argued in favor of ENUM; however, there are several good reasons why developers should reconsider ENUM and use it sparingly.
There are three core reasons why ENUM is a data type that should be reconsidered.
ENUM requires a rebuild of the table when adding a value to the middle of the set.
While it’s true that adding a value to the end of the ENUM set doesn’t require a rebuild, appending is often impractical. Adding a new value anywhere else in the ENUM definition will require MySQL to rebuild the entire table – less than optimal for large tables. The time required will depend on how large your tables are, but millions of rows will take a considerable amount of time.
ENUM values are ordered in the order they’re added to the database
If you’ve ever done an ORDER BY on an ENUM column, you’ll notice that MySQL organizes them via the order they were added to the ENUM, rather than alphabetically or numerically. This ties into the first point, because if you want to order values in, say, alphabetical order, you have to reorder the ENUM in alphabetical order, and this results in a rebuilding of the table by MySQL.
It’s also worth noting that other developers on your team may be extremely frustrated when they discover that the column is ordered “incorrectly”; they might expect it to be ordered alphabetically and since it’s not, they will try and figure out why. They may not know the database as thoroughly as the database administrator or developer who put it together, and thus might not know about the ENUM fields.
ENUM values do in the database what should be done in the model.
Contrary to what many people believe, the database is not the model. The model is the domain logic which takes the raw data and turns it into data that the application uses. For example, you may store 0.25 in the database but convert that to 25% when you display it in your view; it’s not stored as 25% in the database, though.
The model should be enforcing the constraints on the data types and values going into your database, not the database itself. The database is simply the storage location for the data the model needs. The same logic tends to apply for triggers and stored procedures, limiting their authorship to manipulating data in the database when doing so via the model would be too time-consuming or resource intensive (the database is generally going to be faster at manipulating database data).
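Here is a minimal sketch of what enforcing that in the model could look like. The Subscription class, its list of statuses, and its method names are all invented for this example; the column behind it would simply be a VARCHAR plus a DECIMAL:

```php
<?php
// Illustrative model: the allowed values live in the model, not in an ENUM,
// and the raw 0.25 from the database only becomes "25%" on the way to the view.
class Subscription
{
    private static $allowedStatuses = array('active', 'paused', 'cancelled');

    private $status;
    private $discountRate;

    public function __construct($status, $discountRate)
    {
        $this->setStatus($status);
        $this->discountRate = (float) $discountRate;
    }

    // The constraint an ENUM would impose is enforced here instead.
    public function setStatus($status)
    {
        if (!in_array($status, self::$allowedStatuses, true)) {
            throw new InvalidArgumentException('Unknown status: ' . $status);
        }

        $this->status = $status;
    }

    public function getStatus()
    {
        return $this->status;
    }

    // Interpret the raw stored value for display.
    public function getDiscountPercent()
    {
        return round($this->discountRate * 100) . '%';
    }
}
```

Adding a new status is then a one-line code change and a deployment, with no table rebuild involved, and the list can be kept in whatever order is convenient.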
I’ve seen ORMs that translate ENUM and SET columns into VARCHAR columns automatically for you. According to Rya Martinsen, Doctrine is one such ORM.
Cases For Using ENUM
When my last blog post was published, one of my good friends, Eli White, pointed out that ENUM could be useful for columns that had data that would always fit into one particular set of values. For example, he said gender would be one such field that might be served well with an ENUM. And if anyone should know about database types, it’s Eli, since he’s a former Digg employee.
Summary
The bottom line is that ENUM has its place, but should be used sparingly. The model should enforce the constraints, not the database; the model should handle interpreting raw data into useful information for your views, not the database.
This morning, I was reviewing the weekly list of topics with the most comments throughout the PHP manual, and I stumbled upon the following code in the documentation for the date() function. This code is designed to tell you the day of the week for any valid date you give it:
<?php
function return_day_of_week($date){
$sy=substr($date, 0, 4);
$sm=substr($date, 5, 2);
$sd=substr($date, 8, 2);
$date_utc=mktime(0,0,0,$sm, $sd, $sy);
$today_utc=mktime(0,0,0,date("m"), date("d"), date("Y"));
if($date_utc>$today_utc){
$future_date=1;
$temp=$date_utc;
$date_utc=$today_utc;
$today_utc=$temp;
}
$utc_difference=$today_utc-$date_utc;
$weeks_count=($utc_difference)/604800;
if($weeks_count<10)
$weeks_count=substr($weeks_count, 0, 1);
else if($weeks_count<100)
$weeks_count=substr($weeks_count, 0, 2);
else if($weeks_count<1000)
$weeks_count=substr($weeks_count, 0, 3);
$days_rest_count=substr(($utc_difference-$weeks_count*604800)/86400, 0, 1);
$was_day_of_week=date("w")-$days_rest_count;
if($was_day_of_week < 0){
if($future_date==1)
$was_day_of_week=0-$was_day_of_week;
else
$was_day_of_week+=7;
}
return $was_day_of_week;
}
?>
I’m sure that this user put a lot of work and effort into this function. I’m sure they were excited to share it with the PHP community. I’m sure they thought they had stumbled on a solution to a problem that everyone needed to solve. I’m sure they had no idea that this code is a great example of using PHP precisely the wrong way.
The example wouldn’t be so bad, except for the fact that it’s a user-contributed note. This means that PHP developers from all over the world (at least those that speak English and read the English version of the PHP manual) might see and, God forbid, implement this function.
There are a number of ways to rewrite this function that would make it shorter and more efficient. For example:
<?php
function getDayOfWeek($date)
{
return date('w', strtotime($date));
}
This example doesn’t have any error or sanity checking, of course (an invalid value will result in a default date being returned). However, it’s considerably shorter than the function in the notes, and does exactly the same thing.
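For anyone who does want the sanity checking, here is a minimal sketch of one way to add it. The function name getDayOfWeekChecked() is my own invention for illustration, and the sketch assumes PHP 7 or later for the type declarations. It relies only on the fact that strtotime() returns false when it cannot parse its input:

```php
<?php
// Hypothetical variant of getDayOfWeek() that refuses unparseable input.
function getDayOfWeekChecked(string $date): int
{
    // strtotime() returns false when the string cannot be parsed.
    $timestamp = strtotime($date);
    if ($timestamp === false) {
        throw new InvalidArgumentException("Unparseable date: $date");
    }

    // date('w') yields 0 (Sunday) through 6 (Saturday) as a string.
    return (int) date('w', $timestamp);
}
```

Whether to throw, return false, or return null on bad input is a design choice; throwing simply makes the failure hard to ignore.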
My point is not that we should shun community involvement in making the manual documentation better; surely we should permit user-generated comments to continue. My point is a cautionary one: user-generated notes are unmoderated, and may be incorrect, buggy, or just plain wrong.
This situation is not unique to PHP; user-generated content has always had the potential for flaws (see the many cases of Wikipedia vandals making the headlines). In programming, flaws can be dangerous, and so, developers should be wary, but not ignorant, of the user-contributed notes in documentation like the PHP manual.
When I started writing this blog post, I had titled it “Tips for Designing Databases” and I planned to talk about various database design techniques. However, as I did more and more research, it dawned on me that one of the most crucial, and most overlooked, components of database development is the selection of data types for columns.
Much of the information presented in this article was taken from presentations by Jay Pipes and a talk by Ronald Bradford. The talks are The Top 20 Design Tips For MySQL Enterprise Data Architects, Join-fu: The Art of SQL Tuning and SQL Query Tuning: The Legend Of Drunken Query Master.
MySQL supports a large number of data types (with Postgres supporting even more). For example, MySQL supports some 10 different numeric data types (INTEGER, TINYINT, SMALLINT, MEDIUMINT, BIGINT, DECIMAL, NUMERIC, FLOAT, REAL, and DOUBLE PRECISION), meaning that database designers need to know and understand how to use each one of them properly. Using them improperly adds stress to the database and generally reflects bad database design.
Since it would be impossible to discuss every data type in a blog post, I will instead discuss some of the most common MySQL mistakes (many of which apply to other database platforms as well), as highlighted by the presentations and blog posts I will cite.
Understanding how disk I/O affects databases
Databases are typically stored on disk (with the exception of some, like MEMORY databases, which are stored in memory). This means that in order for the database to fetch information for you, it must read that information off the disk and turn it into a result set that you can use. Disk I/O is extremely slow, especially in comparison to other forms of data storage (like memory).
When your database grows to be large, the read time begins to take longer and longer. This is a natural occurrence as the engine must read over more and more of the disk in order to find the information you have requested. Poorly designed databases exacerbate this problem by allocating more space on the disk than they need; this means that the database occupies space on the disk that is being used inefficiently.
Picking the right data types can help by ensuring that the data we are storing makes the database as small as possible. We do this by selecting only the data types we need, rather than choosing data types willy-nilly. This helps reduce the size of our rows, and by extension, our database, making reads and writes faster and more efficient.
Picking the right numeric type
The type of integer you select affects the amount of space that integer occupies on the disk – regardless of the value of the number you actually store in it.
For example, a BIGINT occupies 8 bytes of space, while a TINYINT occupies 1 byte of space. While the 8 byte integer gives you the ability to store huge numbers, it also means that you must store all eight bytes every time you store a record into that table. If you’re storing numbers like 2,000 or 5,000, you’re wasting lots and lots of bytes. Multiplied across millions of rows, those wasted bytes make your reads slower, because the database must read more of the disk to return the same number of rows.
Also, many people assume that they must pick a larger integer size because the integer size they picked might not allow for enough values. For example, a SMALLINT will allow you to store up to 32,767 – if you leave it signed. A signed integer reserves one bit to indicate whether the value is positive or negative. But if you declare the column UNSIGNED, that bit becomes available for the value itself, allowing a SMALLINT to store up to 65,535 – roughly twice what it can store as a signed integer. Your primary keys should always be unsigned, especially if they are auto-incremented; MySQL will never assign a negative value as an auto-incremented primary key.
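To make the arithmetic concrete, here is a rough sketch that computes the signed and unsigned ranges from the byte sizes, and the bytes saved by choosing a smaller type. The constant and function names are made up for this illustration, and the sketch assumes a 64-bit build of PHP 7 or later. BIGINT is left out because its unsigned maximum does not fit in a PHP integer.

```php
<?php
// Storage size in bytes for some MySQL integer types.
const INTEGER_BYTES = [
    'TINYINT'   => 1,
    'SMALLINT'  => 2,
    'MEDIUMINT' => 3,
    'INT'       => 4,
];

// Returns [minimum, maximum] for the given type.
function integerRange(string $type, bool $unsigned): array
{
    $bits = INTEGER_BYTES[$type] * 8;

    if ($unsigned) {
        // Every bit holds part of the value.
        return [0, 2 ** $bits - 1];
    }

    // One bit is reserved for the sign.
    return [-(2 ** ($bits - 1)), 2 ** ($bits - 1) - 1];
}

// Bytes saved across a table by using a smaller column type.
function bytesSaved(int $rows, int $bytesUsed, int $bytesNeeded): int
{
    return $rows * ($bytesUsed - $bytesNeeded);
}
```

For instance, a million rows with a BIGINT column (8 bytes) where a SMALLINT (2 bytes) would do means bytesSaved(1000000, 8, 2), or about 6 MB for that one column alone, before any indexes on it are counted.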
Last week, I received an email from someone who told me how the Suhosin patch had created problems for their team, and suggested that I write about it here. I thought this was a great idea, for a number of reasons. Particularly, Suhosin is one of those PHP patches that alters the way PHP operates in a fundamental fashion, yet also is installed by default in many places (for example, Ubuntu compiles this patch in by default on their installation).
For starters, what is Suhosin? Suhosin is a PHP patch that “hardens” PHP’s security features. The makers of Suhosin describe it in this way:
Suhosin is an advanced protection system for PHP installations. It was designed to protect servers and users from known and unknown flaws in PHP applications and the PHP core. Suhosin comes in two independent parts, that can be used separately or in combination. The first part is a small patch against the PHP core, that implements a few low-level protections against bufferoverflows or format string vulnerabilities and the second part is a powerful PHP extension that implements all the other protections.
So how does Suhosin affect you? Suhosin can affect you because it fundamentally alters the way PHP operates. Here are some of the features and “gotchas” that you should watch out for:
Allows the disabling of eval()
If your application uses eval() for any reason, and you deploy it to a remote server hosted by someone else, there’s a chance that they may have disabled eval() which would break your application.
I have no intention of defending eval(); I don’t use it, and I’m not going to make statements on whether or not you should. However, if you have a legitimate use, you must be careful to make sure that eval() is not disabled.
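If you want your application to check for this up front, something along these lines might work. This is only a sketch: the function name is hypothetical, it assumes the Suhosin extension (not just the core patch) is what disables eval(), and it assumes the setting is exposed through the suhosin.executor.disable_eval directive.

```php
<?php
// Hypothetical helper: reports whether the Suhosin extension appears
// to have eval() switched off on this server.
function suhosinEvalDisabled(): bool
{
    // Without the extension loaded there is nothing to check.
    if (!extension_loaded('suhosin')) {
        return false;
    }

    // ini_get() returns false for unknown directives and a string
    // such as "1" or "0" otherwise; normalise it to a boolean.
    return filter_var(
        ini_get('suhosin.executor.disable_eval'),
        FILTER_VALIDATE_BOOLEAN
    );
}
```

A check like this could run in an installer or a health-check page, so that the failure shows up as a clear message rather than a broken feature.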
Disallowing of Remote URL Inclusion
While this is generally a poor programming practice to begin with, Suhosin disables your ability to include remote URLs. For example:
<?php require 'http://www.anothersite.com/';
This will fail with Suhosin installed and activated. While this is a horribly dangerous programming practice in the first place (you should use file_get_contents() instead), it might generate problems for your application if you are unaware that Suhosin is installed.
Changes a script’s ability to modify the memory_limit
Occasionally, I’ve changed the memory limit on the fly for one script (a cron job, for example) in order to prevent the script from failing. Normally this value can be changed at runtime with ini_set(); however, Suhosin changes this behavior and can prevent you from raising the memory limit on the fly. This can create problems if you expect or need the memory limit to be alterable.
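One defensive habit is to verify that the change actually took effect rather than assuming it did. Here is a minimal sketch; the function name is my own, and it assumes PHP 7 or later. It relies on ini_set() returning false on failure, and reads the value back as a second check:

```php
<?php
// Hypothetical helper: tries to change memory_limit and reports
// whether the new value is really in effect afterwards.
function changeMemoryLimit(string $limit): bool
{
    // ini_set() returns the old value on success, false on failure.
    $previous = ini_set('memory_limit', $limit);
    if ($previous === false) {
        return false;
    }

    // Read the setting back in case the change was silently refused.
    return ini_get('memory_limit') === $limit;
}
```

A cron job could call changeMemoryLimit('256M') at the top of the script and log a warning when it returns false, instead of dying halfway through with an out-of-memory error.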
Allows limits on length of REQUEST arrays
If you have a particularly long form, you may run into this problem: Suhosin allows you to limit the length of the REQUEST array, thus limiting how long your form is. While you may never run into this, you should be aware of the possibility that Suhosin might be responsible for this.
Super-long arrays can create problems in PHP, and attackers might attempt to add millions of form fields with the hopes of generating an error or somehow affecting your application. While this protection can be good, you should be aware of its ability to adjust and affect your application as well.
So is Suhosin bad?
Absolutely not. Suhosin does a number of good things, and helps protect against a number of possible attacks and vulnerabilities in PHP. That being said, Suhosin is not a replacement for good coding practices. Its installation on major servers is largely due to the fact that server owners wish to control aspects of PHP that are not otherwise configurable. It is therefore their right to install this patch and configure it any way they like.
Suhosin is by no means a requirement for PHP development. You can, and should, learn the PHP best practices so that patches like Suhosin are merely an aid, not a crutch. Still, because Suhosin is installed by default as a part of many PHP installations (this server uses Suhosin), you should be aware of its ability to act as a little bit of an “invisible hand” throughout the PHP world, guiding your security choices before you even have the chance to make them.
How do I make sure my application is compatible with Suhosin if I’m going to use it?
Suhosin includes a compatibility mode called
Truncated by Planet PHP, read more at the original (another 566 bytes)
There are a large number of PHPers looking for jobs right now. After having just gone through the process myself, I wanted to put together some of the most common PHP interview questions. These questions are all non-technical, but do represent the soft side of PHP interviewing. I cannot help you if you don’t have the technical skills to answer the technical questions, but answering these questions correctly is often the key to making or breaking your chances with an interviewer who otherwise has fine technical candidates.
Why did you leave your last position?
This question is a hard one to answer, particularly if (like me) your departure was public. However, no employer wants to hear you beat up on another employer. They also don’t want to hear that you “outgrew” an employer (I learned that lesson the hard way). They fear they’re going to be next.
Be honest, but be able to present yourself in a good light. If you lost your job, be careful about how you phrase it. If you left or want to leave your current employer, find a reason that doesn’t make you sound flaky, flighty, or arrogant.
How did you get into PHP?
Work up a good personal anecdote when you prepare to answer this question. This is not a time for the answer of “I wanted to make a lot of money,” even if that’s the true reason. My guess is that it’s in fact not the true reason.
My answer is that I got into PHP because I had a particular problem I needed to solve: I was running an online roleplaying game and I needed to find an automated way to do the calculations involved. I tried Excel but one player didn’t own a copy and wouldn’t have been able to play. I tried ASP and .NET but hated them; PHP worked for me, and that’s how I got into it.
The above answer is the truth, and also has the qualities of being able to show an interviewer that I love the language I’m working in. This isn’t just a job, but a passion. If you’re passionate about PHP, think up a (true) answer that makes them feel good about hiring you.
Where do you ultimately want to be in life?
Contrary to popular belief, the interviewer is not interested in hearing about your dreams. They want to know how well you’re going to fit into their organization.
This is not the time to talk about your desire to start a business or go off to law school in two years (also a lesson from the school of hard knocks). This is a time to talk about how you want to become even better at what you do than you are now, and maybe graduate into a team lead. Keep it focused on the organization at hand. And don’t give too much away.
How many gas stations are there in Los Angeles?
The good news: there is no correct answer to this question so any answer will work. The bad news: this is a question of “how do you think?” without asking that specific question. You’re being graded on your approach to big, abstract questions.
There’s no true way to prepare for these questions, but when you get one, you need to be ready to answer it. Your best bet is to try and reason out an answer with the interviewer; use them as a resource to check your logic and come up with an answer. The answer is unimportant; the thought process is wholly important. So go with that.
These questions are generally rare, but come up in interviews that employ the “Microsoft style” of interviewing. The good news is that if you answer coherently and competently, you should be more than fine.
What’s your salary range?
Never ever answer this question with a number. You’d be surprised at the number of people who think answering this with a number is a good idea. If you answer the question with a number, you can guarantee yourself a salary no higher than that number, and worse, you may even eliminate yourself from consideration by naming a number that is too high.
There are a number of answers to this question. First, there’s the answer that “I am open to considering all offers, and I hadn’t really thought about a salary range.” If that doesn’t work, you can also try and explain that you felt undervalued at your last employer, and that you’d rather not come up with a number right now. Do what you can to be polite but to not give a number to your interviewer, if you can avoid it.
Do you have any questions for us?
You bet your ass you do! You want to find out how well they do on the Joel Test and things like core hours. Ask about dress code. Ask about equipment you’ll be using. Ask about the projects you’ll be working on. Ask to see your wo
Truncated by Planet PHP, read more at the original (another 1188 bytes)
When I lost my job on October 12th, I knew that the PHP community was an invaluable resource for finding a new one. What I didn’t expect, though, was the outpouring of support and assistance I would receive from that community. I learned a lot of lessons from the job search, and thankfully, largely due to the involvement of the PHP community, I have a long-term contract that is both paying the bills and letting me form my own consulting company – a long term dream I’ve had (more on this in a future blog entry).
I wanted to share some of the lessons I’ve learned, because I think that they are important, and they helped me to find a position quickly, effectively, and easily. I actually had more than one opportunity to choose from – something that’s almost unheard of in the current economy.
Network before you need it.
This may be cliche at this point but it’s also the most true and important point: build your network, before you need it. Go to conferences and talk with people. Don’t be a jerk, but instead be a part of the community. Take the hallway track at conferences and be involved in your developer’s group. It matters and it will make a huge difference when you’re looking for a job.
Be involved in writing good content for the community.
I can’t tell you how many times I heard “I read your blog and…” as an introduction from a possible employer. I probably responded to well over 100 leads just from this blog. Writing on your topic of expertise will draw attention. Make sure that if you’re a PHPer that you’re syndicated on Planet PHP and PHPDeveloper.org. And make sure that you put out good content.
Make use of social networks.
I was retweeted more than 200 times during my job search. This process was helpful because ultimately it was a retweet that got me the interview that got me the job. Make use of the social networks that you are a part of – Twitter seemed to be most effective.
People are generally good and are willing to help you.
The PHP community is full of good people. They’re willing to help you, if you need it, and if you’re deserving. Without asking, I had lots and lots of people retweeting me, and the goodwill that was generated made me very proud to be a member of this community.
And now…the moment you’ve all been waiting for…
As a component of my job search I offered a $100 gift card to anyone who referred me to a company that hired me. Though I never posted any official rules, the unofficial rules that I had were as follows:
I had lots of referrals, but only one led to a job offer. Even though I declined that offer, I still feel it is appropriate to award the $100 Amazon gift certificate to this individual. This person is none other than Cal Evans, who unfortunately is also looking for a job himself at the moment (p.s. hire him!). Congratulations, Cal.
Thank you to everyone who helped me during my job search. I appreciate each and every one of you and your contributions. Ultimately, I have a job now because of the community. And I can’t thank you enough.
In the last two entries we have talked about the concept of layer abstraction: that is, that exceptions should not be allowed to pass out of one layer and into another. So, when an exception is raised in the database layer it should be caught in the controller. But how do we go about making sure that exceptions raised in the database layer are properly recorded and processed, ensuring that we have error logging and don’t simply silence our exceptions?
There are a number of ways to encapsulate one exception within another, or “nest” our exceptions. Let’s discuss the ways.
Nesting Exception Messages
The base Exception class has a built-in __toString() magic method, meaning that we can automatically convert our exceptions into strings. This makes the following possible:
try {
throw new Exception('Message 1');
} catch (Exception $e) {
$e = new Exception('Message 2: ' . $e);
}
try {
throw $e;
} catch (Exception $e) {
$e = new Exception('Message 3: ' . $e);
}
try {
throw $e;
} catch (Exception $e) {
throw new Exception('Message 4: ' . $e);
}
We get back the following exception error message:
Exception: Message 4: exception 'Exception' with message 'Message 3: exception 'Exception' with message 'Message 2: exception 'Exception' with message 'Message 1'
The messages are nested, but this does generate some potential issues. First, our exceptions do not have stack traces; the logged stack trace will be the stack trace given in the final exception. Second, the exception message doesn’t contain the exception message codes. However, for simple nested exceptions, this is a good strategy.
Extending Base Exception And Writing Nesting Code
It is possible in PHP to nest exceptions by writing code to do so. For example:
<?php
class MyException extends Exception
{
protected $priorException;
public function __construct($message, $code = 0, Exception $previous = null)
{
$this->priorException = $previous;
parent::__construct($message, $code);
}
public function getPrior()
{
return $this->priorException;
}
}
This exception will take a third optional argument of a previous exception, allowing you to nest the exceptions. When preparing to log your exception, you can opt to iterate through any possible previously thrown and nested exceptions, and log any of the data you need.
Using PHP 5.3’s Built-In Nested Exceptions
Anyone who noticed the kludgy way I named the $priorException variable probably wondered why; the reason is that PHP 5.3 introduces nested exceptions as a default part of the PHP base Exception class. While the above code will work, if you are using PHP 5.3, you can pass any previous exception as a third argument (like above), and use the Exception::getPrevious() method to get a previously raised exception.
It would still be wise to develop code that would iterate through the exceptions and log the various data you need; PHP doesn’t incorporate a way to automatically do this.
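As a rough sketch of what that iteration could look like, the function below walks the chain with getPrevious() and collects one line per exception. The function name and the log line format are assumptions of mine, and the type declarations assume PHP 7 or later:

```php
<?php
// Hypothetical helper: flattens a chain of nested exceptions into
// log-ready strings, outermost exception first.
function collectExceptionChain(Exception $e): array
{
    $entries = [];

    // getPrevious() returns null once the end of the chain is reached.
    for ($current = $e; $current !== null; $current = $current->getPrevious()) {
        $entries[] = sprintf(
            '%s [%d]: %s',
            get_class($current),
            $current->getCode(),
            $current->getMessage()
        );
    }

    return $entries;
}
```

Each entry could then be handed to error_log() or to whatever logger your application already uses. Unlike the string-nesting approach, every exception in the chain keeps its own code and stack trace, so those can be logged too if you need them.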
Summary
You can honor layer abstraction and still ensure that the errors raised are logged and handled appropriately using the various exception nesting techniques above. Ultimately, as PHP improves, so will the nesting options in exceptions.
On Monday, we talked about the basics of exceptions and how they are used in PHP (as well as in other object-oriented programming languages). As promised, today we are going to talk about extending the base exception class in PHP.
One of the things that you can (and should) do with PHP exceptions is extend them to suit your own purposes. While the base Exception class in PHP is neither abstract nor impossible to use on its own, extending exceptions gives you a great amount of flexibility and power, particularly in three areas: customization, identification, and abstraction.
First, a little bit on how to expand and extend exceptions. Exceptions are extended just like any other class using the extends keyword. There aren’t a whole lot of methods we can override in the base Exception class, as most are defined as final; however, this does not prohibit us from adding our own methods. For more on the Exception API, please check out the details in the manual.
Extending Exceptions For Customization
When you extend the base exception, you can customize the exception to suit your needs. For example, I always include a setUserMessage() and getUserMessage() in my exceptions, so that when the view gets the exception, it can display a nicely formatted user-friendly error message without a bunch of junk like the stack trace.
class MyException extends Exception
{
protected $userMessage;
public function setUserMessage($message)
{
$this->userMessage = $message;
}
public function getUserMessage()
{
return $this->userMessage;
}
}
In the example above, we have successfully extended the Exception class and added our own methods. This adds a level of custom code to the exception. We can change a number of other behaviors as well (we could theoretically add a __destruct() method that logs the exception to a custom error log, for example).
Extending Exceptions for Identification
When a generic exception is thrown in your application, you have to read through the stack trace to determine where and what caused that exception to be thrown. This can be a time-consuming process, especially if your stack trace is exceptionally large (pun intended). Instead of doing this, we can extend the Exception base class and name our new exceptions things that are easy to identify.
class ControllerException extends Exception {}
class ActionException extends Exception {}
class ModelException extends Exception {}
class ViewException extends Exception {}
By doing what we’ve done above, we’ve now made it easy to identify which part of our application the exception comes from. If you have an exception raised that is of type ModelException, you know that the exception came out of the model, and you can ignore the view or the controller.
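Named exceptions also let the catching code branch on type. The following is a small sketch of that idea; identifyLayer() is a made-up function for illustration, two of the exception classes are repeated so the snippet runs on its own, and the type declarations assume PHP 7 or later:

```php
<?php
class ModelException extends Exception {}
class ViewException extends Exception {}

// Hypothetical helper: runs a task and reports which layer failed.
function identifyLayer(callable $task): string
{
    try {
        $task();
        return 'no exception';
    } catch (ModelException $e) {
        // Only exceptions raised by the model land here.
        return 'model';
    } catch (ViewException $e) {
        return 'view';
    } catch (Exception $e) {
        // Anything else falls through to the generic handler.
        return 'unknown';
    }
}
```

PHP checks the catch blocks in order, so the most specific types need to come before the generic Exception block.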
It’s perfectly acceptable to extend exceptions and not add methods to the extended exception. You can opt to extend the Exception base class into another base class, where you define a variety of custom methods, and then extend from that, like so:
class MyException extends Exception
{
protected $userMessage;
public function setUserMessage($message)
{
$this->userMessage = $message;
}
public function getUserMessage()
{
return $this->userMessage;
}
}
class ControllerException extends MyException {}
class ActionException extends MyException {}
class ModelException extends MyException {}
class ViewException extends MyException {}
In this case, each of our new exceptions will have a setUserMessage() and getUserMessage() method.
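Here is a brief sketch of how a view might take advantage of that. The errorTextForView() function and its fallback text are hypothetical, the classes are repeated so the snippet is self-contained, and the type declarations assume PHP 7 or later:

```php
<?php
class MyException extends Exception
{
    protected $userMessage;

    public function setUserMessage($message)
    {
        $this->userMessage = $message;
    }

    public function getUserMessage()
    {
        return $this->userMessage;
    }
}

class ModelException extends MyException {}

// Hypothetical helper: picks the text that is safe to show a visitor.
function errorTextForView(Exception $e): string
{
    // Prefer the friendly message when one has been set.
    if ($e instanceof MyException && $e->getUserMessage() !== null) {
        return $e->getUserMessage();
    }

    // Never leak the raw message or stack trace to the page.
    return 'Something went wrong. Please try again later.';
}
```

The technical message stays available through getMessage() for the logs, while the visitor only ever sees the friendly text.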
Extending Exceptions for Abstraction
One of the most important principles in object-oriented programming is abstraction, and specifically, layer abstraction. Layer abstraction is the abstraction of each layer from the other layers, like an onion. Each layer should have its own exception type.
For example, a request that goes into the Controller might get sent to the Model. If an exception is raised in the Model, it should not be output in the View. It should instead be captured in the Controller and handled, or a new Controller-type exception should be thrown that wraps the exception raised in the Model.
The same applies for l
Truncated by Planet PHP, read more at the original (another 892 bytes)
A great feature of PHP is the ability to throw and catch exceptions. This feature was introduced in PHP 5, and has been around for years in other languages like Python.
Exceptions make it easy to interrupt program flow in the event that something goes wrong. They allow you to customize how a program handles errors, and to degrade an application gracefully. This week, we will discuss various exception handling techniques, and today we will discuss the basic dos and don’ts for exceptions.
First, what is an exception? An exception is an object that is “thrown” by your application. When an exception is thrown, normal processing stops and PHP looks for a matching catch block; if none is found, the exception goes unhandled and the script halts. To throw an exception, you use the following syntax:
<?php
throw new Exception('my exception message');
There are a couple of things at work here: first, we are using the “new Exception” syntax to create an instance of the built-in Exception class. Second, we are using a special keyword in PHP called “throw”, which raises the exception and sends it up the call stack.
If left like this, the exception thrown above will bubble up and cause processing to halt at the point when the exception is raised. This is typical error behavior, but what makes exceptions special (and useful) is the ability to “catch” them.
<?php
try {
throw new Exception('my exception message');
}
catch (Exception $e)
{
// do some sort of error handling here
}
Catching exceptions allows us to try and recover from the error, or allow our application to degrade gracefully. In production, unhandled exceptions will cause the page to stop loading (or not load at all), but handled exceptions allow you the ability to redirect a user to an error page or do other error handling.
That is the basic syntax for using exceptions. But when should you use them and under what conditions? Here are some tips for making proper use of exceptions:
Exceptions are a part of object-oriented programming.
This may well be the most controversial point of this blog entry, but exceptions are really best used, and should mostly be used, with object-oriented programming. The exception itself is an object. PHP offers a number of error raising options that I recommend for use in procedural code, but exceptions should mostly be used with objects.
Extending exceptions is cool and encouraged.
As a developer you are allowed and encouraged to extend the base exception class on your own to create custom exceptions. These custom exceptions need not implement any custom methods; instead, you can use them to raise exceptions in different parts of your application. For example, you can raise a custom DatabaseException in the database class while raising a custom ActionException when actions are performed.
Exceptions can be extended like any other class:
class CustomException extends Exception {}
We can then throw CustomException. You can even further extend CustomException (for example if you want to implement certain custom methods and then have other exceptions use those methods). Note that in order to throw something it must extend the base Exception class; otherwise PHP will not allow it to be thrown.
Be sure that your exceptions honor layer abstraction.
One of the more complicated things about handling exceptions is that you want to honor layer abstraction when throwing and catching exceptions.
For example, let’s say that PDO raises an exception due to a unique key constraint in the database. Unhandled, it will bubble up to the top. If the PDO exception was caused by something in your Controller, allowing the PDO exception to bubble up would be a violation of layer abstraction.
A better choice would be to catch the PDO exception and wrap it in a Controller exception. For example:
<?php
try {
// some PDO action here
}
catch(PDOException $pdoE)
{
throw new ControllerException('There was an error: ' . $pdoE->getMessage() );
}
When the exception bubbles up, the PDO exception will have been handled, but the message will be included in a ControllerException. This is an acceptable way to handle exceptions that honors the principle of layer abstraction.
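If you are on PHP 5.3 or later, one variation is to pass the original exception along as the third constructor argument, so that it remains available through getPrevious() for logging. A minimal sketch follows; runInController() is a hypothetical wrapper for illustration, and it assumes the PDO extension is loaded:

```php
<?php
class ControllerException extends Exception {}

// Hypothetical wrapper: runs a database action on behalf of the
// controller and translates PDO failures into controller failures.
function runInController(callable $databaseAction)
{
    try {
        return $databaseAction();
    } catch (PDOException $pdoE) {
        // Keep the PDO exception attached as the "previous" exception.
        throw new ControllerException(
            'There was an error: ' . $pdoE->getMessage(),
            0,
            $pdoE
        );
    }
}
```

The code is passed as 0 rather than copied from the PDO exception because PDO error codes can be SQLSTATE strings, while the Exception constructor expects an integer.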
Lots of people have the itch to write their own frameworks. They think that they can do better than Zend, Cake, Symfony, or application-level frameworks like Drupal. They’re convinced that those designers and developers made fatal flaws, and they can improve upon them. They’re just itching to give it a shot.
So for those of you wanting to write your own frameworks, feel free. But don’t even think about putting it in production until you’ve read this blog post.
Lots of times new developers are shot down from writing their own frameworks by bosses or community members who insist that “NIH” (not invented here) has no place in their organization or language. They’re told that the existing frameworks are the “gold standard” and that they should take the time to learn those, instead of toying with their own.
To me, that doesn’t make a whole lot of sense. When comp sci students are attending school, they (hopefully) learn C, even though there are higher-level languages that do many of the low-level functions of C automatically. Why do they do that? Because they need to learn the basics.
Elizabeth Smith said basically the same thing about frameworks in a tweet: “I think every PHP developer should write a framework – not necessarily to use but to LEARN from, it’s amazing how many don’t know basics.” This drives home a very important point: writing your own framework is a fabulous learning experience.
Writing your own framework will force you to make architecture choices that will in turn make it easier to see the wisdom (or lack thereof) in other frameworks. It also forces you to make decisions that you might otherwise have thought you could avoid: architectural decisions, coding decisions, lack-of-time decisions. Suddenly, the other frameworks don’t seem so kludgy and awkward.
Programmers should be wary of putting their frameworks into production, but should be willing and able to write one if the situation calls for it. The tried, true and tested frameworks that are out there are more than sufficient to meet most needs, and anyone who writes a framework will quickly learn that the decisions of the framework authors are common solutions to difficult problems, and should not be taken lightly.
So feel free to write your own framework and learn from the experience. I know I have – I called mine Modus.
Nearly a decade ago, Joel Spolsky came up with a method by which to evaluate software development shops that has come to be known as the Joel Test. This crucial test evaluates a software development company on the basis of twelve criteria points; Spolsky said that “a score of 12 is perfect, 11 is tolerable, but 10 or lower and you’ve got serious problems. The truth is that most software organizations are running with a score of 2 or 3, and they need serious help, because companies like Microsoft run at 12 full-time.”
When Joel wrote the test, there wasn’t much development for the web; the little that was being done wasn’t being done in any of the modern languages that we write in today. In fact, Facebook, Myspace, Twitter, Gmail, and LinkedIn hadn’t even been invented yet.
Today’s world makes heavy use of web-based software (the term “software as a service” keeps floating around). And so, it is necessary to update Joel’s test in order to properly apply it to web development. This has been done by some folks, but I will do it again, mostly because I disagree with them.
Note: I will not be reprinting the original Joel Test; you can read that on your own.
The Joel Test – 2009 Edition
Again, just like the original, you can answer a quick yes or no to these questions in about five minutes. You should know how your operation scores without doing much deep thinking.
Let’s look at each point separately.
1. Do you use source control?
This is completely unchanged from the original test, because it is one of the most crucial components of professional software development. Whether you’re a team of one or one hundred, you need source control. Since 2000, lots of new options have been introduced: Mercurial, Git, Subversion, to name a few. If you’re not running version control you may as well hang a shingle that says “your code is not safe with us.” Because it isn’t.
2. Can you make a shippable version of your software in one step?
There’s a lot of argument about how or whether this applies to web development. It does.
Web developers need to have a build process in place. That build process needs to do things like run the unit tests, create a tarball or package version for distribution, and run other tasks. My personal build process builds a manifest of all the classes for the autoloader and strips comments out of my code.
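As a rough sketch of what such a build step might look like (this is not the actual build script described above; stripComments() and buildManifest() are hypothetical names, and namespaces are ignored), PHP’s tokenizer can be used to drop comments and to find class declarations:

```php
<?php
// Hypothetical helper: returns the PHP source with all comments removed.
function stripComments($source)
{
    $output = '';
    foreach (token_get_all($source) as $token) {
        if (is_array($token)) {
            // Array tokens look like array(id, text, line number).
            if ($token[0] === T_COMMENT || $token[0] === T_DOC_COMMENT) {
                continue; // skip both comment kinds
            }
            $output .= $token[1];
        } else {
            // Single characters such as ';' or '{' arrive as plain strings.
            $output .= $token;
        }
    }
    return $output;
}

// Hypothetical helper: maps each declared class name to its file name.
// $sources is array(file name => PHP source text).
function buildManifest(array $sources)
{
    $manifest = array();
    foreach ($sources as $file => $source) {
        $tokens = token_get_all($source);
        $count = count($tokens);
        for ($i = 0; $i < $count; $i++) {
            if (!is_array($tokens[$i]) || $tokens[$i][0] !== T_CLASS) {
                continue;
            }
            // Skip whitespace after the "class" keyword.
            $j = $i + 1;
            while ($j < $count && is_array($tokens[$j]) && $tokens[$j][0] === T_WHITESPACE) {
                $j++;
            }
            // Only a plain name counts; "Foo::class" and anonymous classes are ignored.
            if ($j < $count && is_array($tokens[$j]) && $tokens[$j][0] === T_STRING) {
                $manifest[$tokens[$j][1]] = $file;
            }
        }
    }
    return $manifest;
}

$source = "<?php\n// internal note\nclass Engine { /* todo */ }\n";
echo stripComments($source);
print_r(buildManifest(array('Engine.php' => $source)));
```

A real build would read the files from disk and write the manifest out for the autoloader; a build tool such as Phing could call a script like this as one of its tasks.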
For PHP, we have a number of tools available to us. My personal preference is Phing, but that’s only because I’ve used it; some prefer Apache Ant. Whichever you use, you need a build system.
Someone will inevitably argue that the nature of the web allows for deployment of code without creating a build. This is true, but it also lends itself to bad development practices. If you simply do an “svn up” whenever you’re ready to release new features, you never actually go through the process of reviewing, tagging and releasing something; you’re always updating from trunk, and this makes it hard to version and fix bugs. Yes, there is some overhead incurred, but this is necessary to the art of software development.
3. Do you use continuous integration?
The time to find out that one of your developers broke the build or failed a unit test is within hours of their commit, not at 2 am on the day after the release was due at 5 pm.
Continuous integration (there’s phpUnderControl and
Truncated by Planet PHP, read more at the original (another 9788 bytes)
One thing I’ve noticed in hunting for a job recently is the number of companies that insist that you write them a code sample to spec. Not just any code sample, but a fully functional, complete application. This is absurd, for several reasons.
Eli White spends a good deal of time arguing why coding tests are bad. I won’t rehash that here.
My biggest pet peeve with these types of tests is this: there are a lot of companies out there, and I’m sending out resumes to each one that I can find. I simply do not have the time to write fifteen code samples a day, just because you want to evaluate me against your coding test. Period.
Joel Spolsky talks about how it’s important to ask developers to write code during their interview. But he also goes into detail about how he interprets this in his own company: namely, he asks people to write functions during the interview. He doesn’t administer a coding test. He wants to see how people think, how they respond to critique, and how they come up with solutions. He likens the coding test to asking a plumber or an electrician to provide some sort of proof that they’re capable.
But asking a programmer to write a complete application for you is like asking an electrician to build a small electrical grid for you before they work on your house. Can you imagine if every plumber had to prove that they could snake pipes before they could work on someone’s plumbing? Plumbers would be even harder to get to your house than they are today.
Asking a developer to write an application for you before you’ve made a formal offer of employment is also disingenuous, and disrespectful. It’s disingenuous because it disregards the value of their time, and disrespectful, especially to mid- and senior-level candidates, because it assumes that they don’t have an understanding of the language and they must prove to you otherwise. A coding sample, in the case of a mid-level developer, or a quick few phone calls, in the case of a senior developer, should establish pretty quickly whether or not they’re qualified.
A coding test also doesn’t identify the most important aspect of a programmer: whether or not they’re capable of learning. A fantastic test might represent their place now, but it doesn’t represent their ability to adapt to changing environments. The number of frameworks and projects in PHP should be evidence enough that the language is constantly evolving, and that we’re going to need people who can learn and adapt.
For those who administer coding tests, please reconsider what you’re doing. For junior developers, they’re fabulous tools. But for anyone who has a body of code samples all ready to go, ask for one. Ask a few technical questions that require code in an interview. And make your decision based on how they think, how they learn, and how competent they are.
Last week I wrote about five tips to improve object-oriented code. This generated a number of important questions, which I will attempt to answer for those who asked them.
“Often times when a developer gives each object only one responsibility, they tightly couple objects together.” Can you explain?
There are two major pitfalls in object-oriented programming: trying to do too much with an object, and trying to couple a number of objects too closely together. For this example, we’ll use the engine metaphor.
<?php
class Engine {
    protected $crankshaft;
    protected $pistons;
    protected $radiator;

    public function __construct()
    {
        $this->crankshaft = new Crankshaft($this);
        $this->pistons[] = new Piston($this);
        $this->pistons[] = new Piston($this);
        $this->pistons[] = new Piston($this);
        $this->pistons[] = new Piston($this);
        $this->radiator = new Radiator($this);
    }
}
In the above example, the Engine has one job: to direct the function of the crankshaft, pistons and radiator in order to move an automobile along. But we have a couple of fatal flaws. The first fatal flaw is that we pass the Crankshaft, Pistons and Radiator a reference to the Engine; while this might make development easy at some point (because the Radiator can send messages to the Engine) it is a poor design. It’s a poor design because the Engine controls the components, not the other way around. Additionally, we are instantiating the Crankshaft, Pistons and Radiator inside our constructor method, which means that those exact Crankshaft, Piston and Radiator classes must be available wherever the Engine is used. Let’s decouple this and fix it.
<?php
class Engine {
    protected $crankshaft;
    protected $pistons = array();
    protected $radiator;

    public function __construct(CrankshaftI $crankshaft, array $pistons, RadiatorI $radiator)
    {
        $this->crankshaft = $crankshaft;
        // A plain array cannot be typehinted per element, so check each piston by hand.
        foreach ($pistons as $piston) {
            if (!($piston instanceof PistonI)) {
                throw new EngineException('Improper piston type used');
            }
            $this->pistons[] = $piston;
        }
        $this->radiator = $radiator;
    }
}
How is this an improvement? First, we make use of dependency injection in our constructor: rather than automatically creating new objects, we expect that the objects have already been created and are being “installed” (injected) into our Engine. Second, we require that the objects utilize a certain interface. Now, some might argue that this is still tight coupling but I disagree: while you must have the interface available to you, interfaces do not define function. They simply define the methods that are available and must have been defined.
In this case, you can pull these interfaces and the Engine class out and build the library elsewhere, so long as you include the various methods of CrankshaftI, PistonI and RadiatorI. This is acceptable because every Piston will have certain methods (fire, upSwing, downSwing, injectFuel) but the functionality can be different depending on the type of piston you want.
This also gives every object its own job. The Piston is only responsible for doing Piston-related tasks; the Engine is responsible for controlling the Piston, but not executing its specific job functions. The Piston doesn’t have to be smart about the Engine or know about the Radiator; it only needs to know about its job. And so, each object has only one clearly defined role.
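To make the point about interchangeable pistons concrete, here is a minimal sketch. PistonI and its four method names come from the text above; the method signatures, the two piston classes and the runCycle() function are assumptions made up for illustration.

```php
<?php
// The interface only promises that these methods exist, not how they behave.
interface PistonI
{
    public function injectFuel();
    public function upSwing();
    public function fire();
    public function downSwing();
}

// Two hypothetical implementations with different behaviour.
class StandardPiston implements PistonI
{
    public function injectFuel() { return 'standard: inject fuel'; }
    public function upSwing()    { return 'standard: up'; }
    public function fire()       { return 'standard: fire'; }
    public function downSwing()  { return 'standard: down'; }
}

class RacingPiston implements PistonI
{
    public function injectFuel() { return 'racing: inject fuel'; }
    public function upSwing()    { return 'racing: up'; }
    public function fire()       { return 'racing: fire'; }
    public function downSwing()  { return 'racing: down'; }
}

// The caller depends only on the interface, never on a concrete piston class.
function runCycle(PistonI $piston)
{
    return array(
        $piston->injectFuel(),
        $piston->upSwing(),
        $piston->fire(),
        $piston->downSwing(),
    );
}

echo implode(', ', runCycle(new StandardPiston())), "\n";
echo implode(', ', runCycle(new RacingPiston())), "\n";
```

Either piston can be handed to the same calling code, which is all the Engine needs to know.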
I do not know about dependency injection – do you have any links that do not require subscription?
I’ll talk about Dependency Injection here. It’s actually a really simple topic.
Since PHP 5, when you pass an object into a function or method by value, what you’re actually doing is passing it in a reference-like state. I say “reference-like” because it behaves somewhat like a reference but somewhat not (to learn more read this). When you act on the object, you change the object globally – that is, every variable that points to that object sees the change. This is because when you pass an object by value you’re not copying the object; you’re simply passing the internal PHP value tha
Truncated by Planet PHP, read more at the original (another 3230 bytes)
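The “reference-like” behaviour of objects described above can be sketched in a few lines. The Counter class and both functions are hypothetical names used only for this demonstration.

```php
<?php
class Counter
{
    public $value = 0;
}

function increment(Counter $counter)
{
    $counter->value++;          // changes the one shared object
}

function replace(Counter $counter)
{
    $counter = new Counter();   // rebinds the local variable only
    $counter->value = 100;      // the caller never sees this object
}

$a = new Counter();
$b = $a;            // copies the handle, not the object
increment($a);
replace($a);

$c = clone $a;      // clone is how you ask for a real copy
$c->value = 50;

echo $a->value, "\n"; // 1
echo $b->value, "\n"; // 1
echo $c->value, "\n"; // 50
```

Changing a property is visible through every variable that holds the same object, while assigning a new object inside a function is not; that second case is where objects differ from true references.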
Last week, I did a talk at the Frederick Web meetup about tips and tricks for improving your object-oriented code. A lot of these tips were adapted from a fabulous presentation by Stefan Priebsch but the ideas are by no means original to him, and they’re exceptionally good ideas when you’re talking about object-oriented code. Slides are at the end of this blog post, and I’m happy to do this talk over again for local groups.
#1 Use Objects. Lots of Objects
This point is taken directly out of Stefan’s slides, because it’s such a good point. There seems to be a perception in the PHP world that using lots of objects is slow, cumbersome, or plain difficult to maintain. But the reality is that this is not true at all (for example the object model in PHP 5.3 is vastly improved over older models).
By using lots of objects, you can make sure that each object has one job and only one job. You don’t have to make objects smart, either; instead you can rely on other objects to do work for you, and give you the things that you need. Adding lots of objects makes it easy, once you understand the architecture, to make changes without having to change massive amounts of code. The best object-oriented framework I ever worked on often required only one or two line changes to make massive improvements in performance, features, or the resolution of bugs.
Hardware is inexpensive, and there are options to optimize and improve your performance. You should make use of them.
#2 Use Interfaces To Make APIs Predictable
Interfaces are a great way to enforce a design. The concept of “design by contract” is the point of interfaces: you establish your contract, and you then use it.
Interfaces allow for strict typehinting, and by typehinting you can ensure that certain methods are always available for your use. Beyond that, interfaces make your API consistent, which is a big boon, especially as your team gets larger. Each interface will provide the model to utilize, and your teams will know that they have to implement a few methods each time they wish to make use of a particular interface.
#3 Use Dependency Injection
For the longest time I didn’t realize the importance of dependency injection. But dependency injection is critically important, especially if you want to write or utilize a framework or do any kind of unit testing.
Many programmers attempt to instantiate objects directly in their code, or grab objects out of singletons. This is a bad approach, because it makes testing impossible. You cannot inject a mock object; you also have trouble with mock data. For example, if you create your database object inside your controller object, you always have to create a database to test with, which adds another variable for possible failure of your unit test.
Learn to use dependency injection. It will make testing and feature addition easier.
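Here is a minimal sketch of the controller and database example, assuming a hypothetical PostStoreI interface, an in-memory ArrayPostStore stand-in and a BlogController (none of these names come from a real framework):

```php
<?php
// Hypothetical contract for whatever supplies the blog posts.
interface PostStoreI
{
    public function fetchTitles();
}

// In-memory stand-in: no database server is needed to run a test.
class ArrayPostStore implements PostStoreI
{
    protected $titles;

    public function __construct(array $titles)
    {
        $this->titles = $titles;
    }

    public function fetchTitles()
    {
        return $this->titles;
    }
}

class BlogController
{
    protected $store;

    // The store is injected rather than created inside the controller.
    public function __construct(PostStoreI $store)
    {
        $this->store = $store;
    }

    public function listAction()
    {
        return implode("\n", array_map('strtoupper', $this->store->fetchTitles()));
    }
}

$controller = new BlogController(new ArrayPostStore(array('first', 'second')));
echo $controller->listAction(), "\n";
```

In production the same controller would receive a database-backed implementation of the interface; the test only ever sees the array version.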
#4 Composition Over Inheritance
There are two concepts that are critical in object-oriented programming: the concept of an object having an “is-a” relationship versus a “has-a” relationship.
For example, an apple “is a” fruit, while an apple “has a” seed. You would never say that a fruit could be a seed, but you could say that a fruit could be an apple. This distinction is important.
Stefan mentions that well-written classes can be extended no matter the circumstances, and I think he’s right. I also know that the more simple your classes, the better off they are in terms of abstraction. Always carefully consider whether or not objects are doing their jobs and what those jobs are.
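The apple example translates into code fairly directly. This is only a sketch; the class and method names are made up for illustration.

```php
<?php
class Seed
{
}

class Fruit
{
    protected $seeds = array();   // "has-a": a fruit holds seeds

    public function addSeed(Seed $seed)
    {
        $this->seeds[] = $seed;
    }

    public function countSeeds()
    {
        return count($this->seeds);
    }
}

class Apple extends Fruit         // "is-a": an apple is a fruit
{
}

$apple = new Apple();
$apple->addSeed(new Seed());
$apple->addSeed(new Seed());

var_dump($apple instanceof Fruit); // bool(true)
var_dump($apple instanceof Seed);  // bool(false)
echo $apple->countSeeds(), "\n";   // 2
```

Inheritance is used only where the “is-a” statement holds; the seeds are composed in as a property.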
#5 Create Loosely Coupled Classes
Often times when a developer gives each object only one responsibility, they tightly couple objects together. This is a mistake, because it
Truncated by Planet PHP, read more at the original (another 1099 bytes)
There seem to be lots and lots of PHP folks out there looking to hire good PHP developers. Finding the right developer can be a challenge, as can finding the right job.
I’ve been looking for a couple weeks now, and I wanted to put together a short blurb on why you should consider hiring me to be on your PHP development team.
I’m a self-taught PHP developer with five years of experience. That means two things: first, I’m curious by nature. Second, I’ve worked hard to get where I am. I’ve seen just about everything you can encounter, from basic database applications to complex web applications. I’m proficient in object-oriented programming, and I’ve toyed with things like Phing, Propel, Zend Framework, Drupal, WordPress, Symfony and others.
As a freelancer I’ve gained valuable time management and team management skills – I managed a small team of two developers on two freelance projects that were large enough to require additional teams. I’ve also learned a good deal about business – a skill that you sometimes might want married with a PHP developer.
But beyond my resume there are some soft skills you should consider: I’m motivated to learn more. I taught myself PHP which means I’m capable of teaching myself other languages. I’m a well-known writer on PHP topics (15,000 unique visitors to this blog this month) who will be published in the near future. I’m extremely active in the PHP community, serving as a leader of the local DC PHP developer’s group. I’m well connected to those who are developing the tools we use, and I’m focused on making the open source world a better place to be.
Feel free to ask around about me: Cal Evans or Keith Casey would probably be happy to tell you that they know me and of the things that I am involved in.
If you’re looking for a PHP developer, I’m looking for a PHP position. Send me a note and let’s chat.
Last week I wrote about some optimizations you can apply to your code that will improve the performance of your site significantly. I also mentioned that regularly an article pops up talking about ways to shave time off your scripts, and I talked about how these articles mostly are bunk. Like this one.
The article I linked above is a run-of-the-mill micro optimization list. The difference here is that the author actually makes use of some benchmarks to make their point. So, let’s go step by step and discover together why this article takes longer to read than the amount of CPU time it saves.
Loops
The author asserts that it is best to calculate the maximum value for a for loop outside of the declaration of the loop. Inadvertently, the author stumbles upon a tried-and-true programming technique: don’t repeat yourself.
The code sample:
#Worst than foreach and while loop
for($i =0; $i < count($array);$i++){
echo 'This is bad, my friend';
}
#Better than foreach and while loop
$total = (int)count($array);
for($i =0; $i < $total;$i++){
echo 'This is great, my friend';
}
This is a true “duh!” moment. Of course a loop is faster when you don’t run a function each time you iterate through it! But bear in mind that if you think a “for” loop is the fastest loop available, you’d be just as surprised as Sara Golemon, one of the internals developers, to find out that it is not.
Loops are a necessary part of programming, even nested loops sometimes (which he argues against). Don’t forgo loops just because you’re worried about performance. If you use them right, performance won’t even be a factor.
Single Vs. Double Quotes
This is one of the two big PHP optimization nightmares that just won’t die. The argument goes like this: because PHP has to parse a double-quoted string twice looking for variables, it is inherently slower than a single quoted string.
The argument isn’t necessarily true.
I did my own benchmark this morning on PHP 5.2.10, and discovered something interesting: when PHP evaluates a simple string that is double quoted and a simple string that is single quoted (that is, there is not a variable in either string), the single quoted string actually runs slower than the double-quoted string. Yep. I’m serious.
Double Quotes Test: 11.838791131973
That is my average for five runs of 10,000,000 iterations over a single-quoted string and a double-quoted string.
When I added a variable (concatenated on the single-quoted string), the two performed as expected: the double-quoted string performed slower.
What does this mean? It means that PHP appears to be smart enough not to parse the double-quoted string twice, if it doesn’t have to. PHP seems to optimize for you, meaning you don’t have to optimize yourself. It’s also worth noting that in 10,000,000 iterations, the average difference between the two was 3/10ths of a second. If you’re trying to save 3/10ths of a second, you may have other areas worth refactoring.
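For anyone who wants to repeat the measurement, a benchmark of this kind can be sketched as follows. This is not the script used for the numbers above; the function names and the test string are assumptions, and the timings will vary by machine and PHP version.

```php
<?php
// Times $iterations assignments of a single-quoted literal; returns seconds.
function timeSingleQuoted($iterations)
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $string = 'This is a simple string';
    }
    return microtime(true) - $start;
}

// Same loop with a double-quoted literal that contains no variables.
function timeDoubleQuoted($iterations)
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $string = "This is a simple string";
    }
    return microtime(true) - $start;
}

$iterations = 1000000; // the post used 10,000,000; lowered to keep the run short
printf("Single Quotes Test: %f\n", timeSingleQuoted($iterations));
printf("Double Quotes Test: %f\n", timeDoubleQuoted($iterations));
```

Run it several times and average the results; a single run says very little, because the difference between the two loops is easily smaller than the noise.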
Pre-Increment Versus Post-Increment
I did a benchmark of this micro-optimization tip. And I found that in fact, pre-increment is actually faster than post-increment. By 5% in fact. That’s a major performance boost, right?
Wrong.
The amount of time I saved on average was 5/100ths of a second. That’s not even enough time for me to have typed the last sentence, or for you to have read it. And I was doing 10,000,000 operations.
The benchmark looks great – a 5% increase in performance – until you realize that all benchmarks are subject to possible fallacy.
Absolute Pa
Truncated by Planet PHP, read more at the original (another 8548 bytes)
A few weeks ago, Packt Publishing sent me a review copy of PHP Team Development. This free copy of the book was sent to me just in time for my vacation, and I had a chance to read it.
Unfortunately, I was largely disappointed by this book. I share many of Lorna Jane’s comments about the book’s writing, and I would add that much of the text seemed simple. The author is also trying to do too much, focusing on a number of topics that have each sparked volumes in and of themselves (Agile programming, anyone?).
The book does make a good beginner book, but is not very telling for anyone that is using PHP and has been doing so a while. It seems to be more a collection of best practices that you should learn as a junior developer than a book for anyone who is actually trying to manage a team. I’d give it to a beginner but not to an advanced developer.
One of my biggest complaints is also that the author takes a firm position without exploring the other options available. Agile programming is not considered to be the One True Way in PHP; far from it. There are lots of development options in PHP (Test-Driven Development being one of them that I prefer), and I think that this is severely lacking. In fact, unit testing isn’t even in the index, even though the description says testing is discussed. This is a major faux pas.
The book also focuses heavily on MVC as the design pattern to Solve All Your Problems ™. While MVC has many uses, it is not the be-all pattern for every problem. This strikes me as a design pattern fallacy and flaw that I’d like to see corrected in future releases.
As a final note, the book is also a paltry 161 pages long, which is not nearly enough to cover the material that the author is trying to cover in any sort of useful detail, and certainly not worth the $32 price, at least to me. I was disappointed getting this book in the mail, and this was a review copy; I cannot imagine if I had spent my money on it.
Time and time again, I come across code that contains a variety of array-handling functions that too often duplicate the work that the PHP core team has done to develop built-in array functions. Since the built-in functions are inherently faster, trying to reimplement them in PHP will inevitably be a performance problem.
Here are five of my favorite array functions, along with their signatures and what they do.
array_key_exists(mixed $key, array $array)
Anyone who has ever had to search an array to see if a key existed can certainly make use of this function. Many times I simply use isset($array['key']) as a replacement, but for small arrays (or to be explicit about what you’re doing) you should learn to use this function. There’s never a reason to duplicate this function. If you want to check the array to see if a particular value exists, use in_array().
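One difference between isset() and array_key_exists() is worth knowing before swapping one for the other: isset() reports false for a key whose value is null. A short sketch (the $settings array is made up for illustration):

```php
<?php
$settings = array('debug' => null, 'mode' => 'fast');

var_dump(isset($settings['debug']));            // bool(false): the value is null
var_dump(array_key_exists('debug', $settings)); // bool(true): the key is there
var_dump(array_key_exists('cache', $settings)); // bool(false): no such key
var_dump(in_array('fast', $settings, true));    // bool(true): searches values, strictly
```

If null is a legitimate value in your array, array_key_exists() is the one that answers the question “is this key present?”.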
usort(array &$array, callback $callback_function)
PHP offers a whole list of array sorting functions. This extensive list provides a function for almost every occasion. But what about the times you want to sort an array and have special needs? usort() comes in handy, because it lets you define a user function (one you write) as the comparison function, while the built-in PHP function still does the actual sorting.
For those that don’t know, a callback function is a user-defined (or PHP included) function whose name you pass to another function as a string. This callback is executed by the function you call. In this case, usort() will pass two elements of the array at a time to the function you define, and your function returns a negative integer, zero, or a positive integer depending on whether the first element should sort before, the same as, or after the second.
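As a small sketch of a comparison callback, here is a sort by string length. The compareByLength() function is a made-up example, not something from the PHP core.

```php
<?php
// Comparison callback: negative, zero or positive, like strcmp().
function compareByLength($a, $b)
{
    if (strlen($a) === strlen($b)) {
        return 0;
    }
    return (strlen($a) < strlen($b)) ? -1 : 1;
}

$fruits = array('raspberry', 'fig', 'apple');
usort($fruits, 'compareByLength'); // sorts in place and reindexes from 0

print_r($fruits); // fig, apple, raspberry
```

Note that usort() changes the array you hand it and throws away the original keys.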
array_pop(array &$array)
This is a cool function. If you’ve ever had a need to get the last element off an array, this is the way to do it. I’ve seen this code replicated a hundred times, but array_pop() is fast, efficient and built-in.
This function takes an array as the argument, and then finds the very last element and pops it off the end. Note that this changes the original array, because the array is passed by reference to the array_pop() function.
<?php
$array = array('apple', 'raspberry', 'banana');
$fruit = array_pop($array);
echo $fruit; // Outputs 'banana'
echo count($array); // Outputs '2'
?>
array_merge(array $array1 [, array $array2 [, array $... ]] )
Combining two arrays can be difficult but this built-in PHP function does a fabulous job of making it easy. This function takes a number of arrays and returns one big array containing all of their keys and values. It’s worth noting that if two arrays contain the same string key, the value from the last array merged in is the one that ends up in the result. Values with numerical keys are never overwritten; they are appended, and their keys are renumbered starting from zero.
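The difference between string keys and numerical keys is easiest to see side by side. A short sketch with made-up arrays:

```php
<?php
$defaults  = array('color' => 'red',  10 => 'a');
$overrides = array('color' => 'blue', 10 => 'b');

$merged = array_merge($defaults, $overrides);

// The string key is overwritten by the later array;
// the numerical keys are both kept and renumbered from zero.
var_dump($merged === array('color' => 'blue', 0 => 'a', 1 => 'b')); // bool(true)
```

If you need numerical keys preserved instead, the + operator on arrays behaves differently and is worth reading up on.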
<?php
$array1 = array('apple', 'blueberry');
$array2 = array('pear', 'banana');
$array = array_merge($array1, $array2);
var_dump($array);
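The output of the above code is:
array(4) {
  [0]=>
  string(5) "apple"
  [1]=>
  string(9) "blueberry"
  [2]=>
  string(4) "pear"
  [3]=>
  string(6) "banana"
}
array_rand(array $input [, int $num_req = 1 ])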
A long time ago I needed to get a random value out of an array of quotes. The code I came up with looked something like this:
<?php
function getRandom(&$array)
{
    $count = count($array);
    $key = rand(0, $count - 1); // valid indexes run from 0 to count - 1
    $quote = $array[$key];
    unset($array[$key]);
    return $quote;
}
This function works, but it’s so much simpler to do just this:
function getRandom(&$array)
{
    $key = array_rand($array);
    $quote = $array[$key];
    unset($array[$key]);
    return $quote;
}
This does exactly the same thing while removing an extra function call that was ultimately unnecessary. While this little performance boost might not ultimately be too great, using the built-in function is much cleaner, much clearer, and much preferred.
What are your favorite array functions?
This blog entry implements The Beginner Pattern.
A blog post over at Object Mentor argues that technical debt and a mess are not necessarily the same thing. This well written blog post discusses the difference, and asserts that taking on technical debt is like taking out a mortgage: it should increase your financial discipline rather than decrease it. The same should be true of technical debt, then.
I would tend to agree that a mess does not constitute technical debt. A mess is just a mess, most of the time. Writing poor code or having a project filled with messy solutions doesn’t incur technical debt; it pushes you towards technical bankruptcy.
However, messes are not always simply messes. Sometimes you have the choice between doing it quickly and doing it well. This isn’t an optimal situation; however, it is an intentional design choice on your part. Writing a hack incurs technical debt, and hacks are almost always ugly. If you had time to write it correctly, you would write it cleanly, and it would incur less technical debt.
Technical debt accrues based on your own choices. Just as you don’t accidentally get into financial debt, you can’t accidentally accrue technical debt. Technical debt is a choice, requiring you to decide between two options. Implementing a messy solution is a choice as well; doing so will only increase your technical debt, in much the same way that adding closing costs to the overall mortgage amount will increase your debt.
The accrual of technical debt is a necessity, especially when deadlines are considered and people have to make quick decisions. But technical debt is a choice; you do not accrue it accidentally.
Every few weeks, someone publishes an article talking about how it’s faster to use single quotes rather than double quotes and how you should use echo() instead of print(). Most of these are bunk; that is, the time we spend talking about them far exceeds the CPU time saved by implementing them.
Micro optimization doesn’t work. So why, then, is this post called “micro optimizations that matter”? The optimizations below could be described as micro – not in the little amounts of performance improved, but in the very minute (if any) changes required to your code to make use of them. All of these optimizations are standard optimizations you should consider, and all of them will offer considerable performance enhancements.
Caching
The fastest, easiest, and least painful way to get a performance boost is to enable caching on your server. There are a number of caches that are available to you.
First, if your database query cache is turned off, turn it on. For MySQL, that means taking a look at the documentation and setting up the query cache to cache all queries. This helps because it prevents the database from having to rerun queries. You should also have an opcode cache installed. I like APC. APC will automatically cache the opcodes from the compilation of your scripts. There are ways to boost the performance of APC that you can investigate as well.
Neither of those two suggestions requires any code changes on your part, but both will yield improvements in performance, sometimes as great as 300% (where APC is concerned). These “micro optimizations” are crucial. There are also some things you can do with caching that will require code changes but will make your application better for them.
The first is to enable the use of either APC or Memcached to cache objects and data points. For example, there’s no reason you should even be asking the database (regardless of the enabling of the query cache) to generate your blog post list every time someone visits your blog. Put that in the cache. You can even put your sessions into memcached to eliminate disk IO. These will require some code changes, but will be worth it.
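The blog post list example can be sketched roughly like this. The CacheI interface, the ArrayCache class and fetchCached() are hypothetical; ArrayCache is an in-memory stand-in so the sketch runs without APC or Memcached installed, and a real implementation would wrap one of those behind the same interface.

```php
<?php
// Hypothetical cache contract.
interface CacheI
{
    public function get($key);          // returns null on a miss
    public function set($key, $value);
}

// In-memory stand-in for APC or Memcached.
class ArrayCache implements CacheI
{
    protected $data = array();

    public function get($key)
    {
        return array_key_exists($key, $this->data) ? $this->data[$key] : null;
    }

    public function set($key, $value)
    {
        $this->data[$key] = $value;
    }
}

// Returns the cached value, calling $loader only when the cache misses.
function fetchCached(CacheI $cache, $key, $loader)
{
    $value = $cache->get($key);
    if ($value === null) {
        $value = call_user_func($loader);
        $cache->set($key, $value);
    }
    return $value;
}

$queries = 0;
$loadPosts = function () use (&$queries) {
    $queries++; // stands in for the real database query
    return array('First post', 'Second post');
};

$cache = new ArrayCache();
fetchCached($cache, 'post_list', $loadPosts);
fetchCached($cache, 'post_list', $loadPosts);
echo $queries, "\n"; // 1
```

The second request never touches the “database”. A production version would also need an expiry time and a way to clear the entry when a new post is published.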
Eliminate Any Sort Of Logged Errors
Disk IO is one of the things that can kill your application’s performance. Disks are usually very slow; people pay attention to how large a disk is but not how fast it is, and memory is always faster. Unfortunately, one of the ways that people kill their apps is by logging unnecessary warnings, errors and notices. I say “unnecessary” because they’re things that should have been resolved before the application went into production. Make sure that you get rid of notices that are avoidable and only have errors raised when something truly does go wrong.
Enable Output Buffering For Everything
It is possible to set an INI directive that will enable output buffering on all of your pages. This is a good thing, because it means that Apache will get the output of your PHP application as a chunk, rather than piecemeal, improving performance and reducing system calls (see this PDF for more).
You can turn on output buffering in the php.ini file with the following directive:
output_buffering = On
Will output buffering solve your problems if your application is resource-intensive or badly written? No. But it will help improve the performance of a well-written application.
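If you can’t change php.ini, the same idea can be applied inside a single script with PHP’s output control functions. A minimal sketch:

```php
<?php
ob_start();                 // start collecting output instead of sending it
echo 'Hello, ';
echo 'world';
$page = ob_get_clean();     // fetch everything collected and stop buffering

echo strlen($page), "\n";   // 12
echo $page, "\n";           // the whole page goes out in one piece
```

Here the two echo calls produce no output of their own; everything is sent at once when the buffered string is finally echoed.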
Make Use Of A Content-Delivery Network
One of the fastest and easiest ways to reduce the total load time of a page and the load on your server is by moving things like images and videos to a dedicated server or content delivery network. For example, you can make use of Amazon’s S3 service. This low-cost service will allow you to have another server provide images, reducing the load on your own server quickly, cheaply, and without too much code modification. Less load on Apache means that you have the ability to serve more pages.
Truncated by Planet PHP, read more at the original (another 3543 bytes)
I went in to work yesterday morning to find out the sad news that the contract I had been hired to work on had been canceled last week. The company tried to find work for me, but was unable to do so. I am now another casualty of the struggling economy. It’s a shame, but I understand the “last in, first out” principle of employment, and I understand the business need.
The company was generous in the severance offered, and there is no malice towards them for what happened. As of today, I’m officially back on the open job market, and I’m looking for work as a PHP developer.
I’d prefer to stay in DC, but I’m willing to telecommute and travel occasionally for business. For those interested, you can contact me at resume@brandonsavage.net for a copy of my resume, and you may read my current online resume.
Working for Applied Security was a wonderful opportunity, and I’m sad to see it end, especially so quickly after starting. But I’ve enjoyed my time there, and I look forward to the new adventures that lie ahead.
PHP allows developers to write a variety of different styles of code: procedural, object-oriented, or simply scripts. This flexibility makes PHP easy to learn, and also means that new developers to PHP may not be programmers in other languages.
For new developers, especially developers who have never been programmers before, moving from writing simple scripts to writing functions is a process that takes time. I developed in PHP for years before I wrote a single function. I also never found a comprehensive tutorial on how functions work, or how to write them. There’s documentation in the manual, but it’s a bit hard to grasp if you’re new. This article is about writing functions.
For starters, what is a function? A function is a collection of code that is available for use and reuse repeatedly throughout a particular script or application. You’re probably already familiar with functions because you probably use them in PHP. If you’ve used print() or mysql_connect() you’ve used a function. Both of these functions encapsulate other code (in their case, core PHP code written in C), and allow you to accomplish certain tasks.
Matthew Turland equates writing a function to tasking a junior level employee. After describing a short name for a particular task, you teach the scope of that task, and then can direct the task to be executed at any time.
The beauty of functions is that you don’t have to rewrite the same code over and over again. Once it’s encapsulated in a function, that function provides a shortcut to executing the code you’ve defined. This is a pretty cool tool when you think about it.
For example, let’s say that you establish a MySQL connection on every single page. You could do the following on every page:
$connection = mysql_connect('localhost', 'user', 'pass');
mysql_select_db('mydatabase');
Or, you can write a function that does this, and call the following on every page:
$connection = myConnectionFunc();
Which would be easier for you? Well, the second one of course. Let’s learn how we did that.
The source code behind myConnectionFunc() is pretty basic. If you don’t understand the syntax entirely from looking at it, don’t worry. I will explain.
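<?php
function myConnectionFunc($host = 'localhost', $user = 'user', $pass = 'pass', $database = 'mydatabase')
{
    // Open the connection, then select the database to work with.
    $conn = mysql_connect($host, $user, $pass);
    mysql_select_db($database);
    // Hand the connection back to the caller.
    return $conn;
}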
That may look complicated, but once you understand the syntax, it really isn’t. We’ll come back to this example once we know a little bit more about functions.
All Functions Share These Characteristics
For starters, all functions share certain characteristics. The function definition starts with the word “function”, followed by the name of the function, an opening parenthesis, any arguments, and a closing parenthesis. You must also include curly braces to delimit the body of the function. Here is the most basic function you can have in PHP:
function basic()
{
}
All functions must share these qualities. They are required by the PHP engine to determine that a function has been defined. Without them, you will get a syntax error.
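To see those pieces working together, here is a minimal sketch of a function that takes one argument and returns a value. The function name greet() and its default value are made up for illustration; they are not part of the example above.

```php
<?php
// Hypothetical example function: one parameter with a default value.
function greet($name = 'world')
{
    // Build the greeting and hand it back to the caller.
    return 'Hello, ' . $name . '!';
}

echo greet(), "\n";       // uses the default argument
echo greet('PHP'), "\n";  // overrides the default
```

Calling the function without an argument falls back to the default, which is the same mechanism myConnectionFunc() relies on for its host, user, password and database.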
Function Bodies
A function body can contain any user-defined (that’s functions you write) or internal (that’s functions PHP provides by default) functions, structures, and code that you like. Remember: functions serve to encapsulate the functionality you want to express, meaning that a function can be called repeatedly to accomplish a certain task. For example, a function can look like this:
function iterateArray($array)
{
    // Start with an empty array so that an empty input still returns an array.
    $newArray = array();
    foreach ($array as $item)
    {
        $newArray[] = 'Iterated: ' . $item;
    }
    return $newArray;
}
What you put in the body is largely up to you. There are few (if any) restrictions. Bear in mind that there are best practices; you should make yourself aware of them through the [...]
Truncated by Planet PHP, read more at the original (another 11939 bytes)
Nearly five years ago I started writing PHP code for fun. I had a project that I was working on, and I needed some sort of a programming language that would do calculations for me, and hopefully make managing a website easier. So I wrote my first web application.
Boy, was it bad.
Looking back at it today, I have to laugh about the naive way I relied on things like register_globals and magic_quotes_gpc. Or about how I was frustrated by the fact that magic_quotes_gpc escaped things, and had to work my SQL queries so that they would work right. Or about how I used addslashes() to “escape” data.
Every day, new people join the PHP world, writing their first “hello world” script and moving on from there to connect to databases, build CRUDs, and otherwise explore the PHP language. If you’re one of them, you shouldn’t feel inadequate. Learning PHP is a process. One of PHP’s strengths is that it is easy to learn, and that anyone can learn how to do it. Fewer learn how to do it properly, but for those who do, it can be a powerful language and a solid tool.
I’ve spent some time writing about beginner issues, and implementing The Beginner Pattern, because I think it’s important to help new developers to the community get better. But if you don’t understand everything, that’s ok. Ask for help, read the blogs and the manual, and keep writing code. A smart person once said that if you look at your code six months from now and think it’s ok, you’re doing something wrong.
In all of that, please also remember to be careful. PHP is a powerful language, with abilities that, if not checked with security concerns, can threaten an entire system. Learn all you can about security. Remember to Filter Input, Escape Output.
Those of us that have been doing PHP for a long time seem to have forgotten what it was like to be new at the language. For those that are new, please don’t become discouraged. PHP needs you, because the generation that comes next will replace the generation that is here now, and that’s how the project keeps moving forward.
A little while ago, I wrote an article discussing why interfaces rock and the way that interfaces work. However, a couple of comments made me realize that I didn’t discuss one of the key elements about interfaces: why you would use them.
One key rule about interfaces is that all methods must be defined as public. You cannot define protected or private methods, and you cannot define any members of any type. You may define constants, as these cannot be overridden by any class implementing the interface.
So why is an interface useful?
An interface is useful because it allows you to know what methods will be there. No matter what, when you get an object that implements a particular interface, it has to have implemented all the methods of that interface. What’s more, you also know that the interface is public, so each of those methods is public. Essentially, this means that an interface is for defining your public API.
Take the following example:
interface ActionI
{
    public function initialize();
    public function execute();
    public function getResult();
    public function setRequest();
}
Each and every one of those public methods will be available on every object that implements ActionI. We know that any such object will have those methods defined, and they form the public API of our object. Regardless of what private or protected methods we implement (the guts that make our object function properly), this is our public API.
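As a rough sketch of what that looks like in practice, the class below implements ActionI and keeps its working parts private. The class name UppercaseAction, its properties and the transform() helper are hypothetical; the interface is repeated only so the snippet runs on its own.

```php
<?php
interface ActionI
{
    public function initialize();
    public function execute();
    public function getResult();
    public function setRequest();
}

// Hypothetical implementation, for illustration only.
class UppercaseAction implements ActionI
{
    private $request = '';
    private $result = null;

    public function initialize()
    {
        $this->result = null;
    }

    // The optional parameter keeps this signature compatible with the interface.
    public function setRequest($request = '')
    {
        $this->request = $request;
    }

    public function execute()
    {
        $this->result = $this->transform($this->request);
    }

    public function getResult()
    {
        return $this->result;
    }

    // Private helper: part of the "guts", not of the public API.
    private function transform($text)
    {
        return strtoupper($text);
    }
}

$action = new UppercaseAction();
$action->initialize();
$action->setRequest('hello');
$action->execute();
echo $action->getResult(), "\n";
```

Code that receives the object only needs to know the four interface methods; transform() can be renamed or removed without affecting any caller.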
An added benefit of doing this is that an interface definition allows you to document more effectively. Theoretically we can document just the four public API methods, and skip the protected or private methods (except for internal documentation), because anyone who is accessing our objects only needs to know about the four public ones. Only the developers who are modifying the internal workings of the objects care about the helper methods.
Interfaces make programming much easier, by allowing us to plan, and by giving us consistency in our public API, without requiring us to define much about the methods to do so.
It’s worth noting that this article contradicts something in my previous article, “Why Interfaces Rock”. That is that I defined a constructor method. When defining your public API, you do not want to define any methods that might be unnecessary in descendant objects. In practice, when you define your interface, make sure to only include the methods that will be necessary for all objects to implement; nothing more, nothing less.
PHP 5 introduced PHP Data Objects (PDO), which has shipped with PHP by default since PHP 5.1. PHP 5.1 also turned on a minimum level of support for SQLite by default, and PDO supports most of the major database engines. PDO offers a number of enhancements and improvements over the various database libraries (e.g. mysql_*, mysqli_*, pg_*), the biggest one being consistency. Still, the large amount of existing code that uses the various database libraries directly means that PDO still isn’t as widely accepted as it should be.
This primer will show the various uses of PDO, and outline some of the benefits.
The following is a sample PDO transaction with a MySQL database:
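<?php
$con = new PDO('mysql:host=localhost;dbname=bank', 'user', 'pass');
// Make PDO throw a PDOException on failure so the catch block can roll back.
$con->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$con->beginTransaction();
try {
    $stmt = $con->query('SELECT SUM(amount) FROM accounts');
    // Fetch a numerically indexed row so it can be passed straight to execute().
    $result = $stmt->fetch(PDO::FETCH_NUM);
    $insert = $con->prepare('INSERT INTO total SET total = ?');
    $insert->execute($result);
    $con->commit();
} catch (PDOException $e) {
    $con->rollBack();
}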
Before going too far: with the exception of the very first line (the new PDO statement), this PHP code is exactly the same for PostgreSQL, SQLite, etc. Why? Because PDO makes it easy to port from one database to another without too much headache. Now, let’s look at what we’ve done.
First, we create a PDO connection. This is pretty standard stuff. We use a DSN, which you can read about for the various database drivers. The next thing we do is we initiate a transaction – a boundary that makes all of our changes happen, or none of our changes happen. In our transaction, we’re not doing anything exciting, but if we were working on three or four dependent tables, we’d want to roll back and not have our changes applied if one table failed to work for some reason.
The next thing we do is get back a PDO statement object. The statement object contains the information about the query we just executed. We get the result out of the statement object, and proceed to prepared statements. This is a bit strange at first, but when you think about it, it’s not so odd. Prepared statements offer many advantages: you can prepare the SQL once and reuse it over and over again, and the values you pass are bound as parameters rather than concatenated into the SQL, meaning you reduce the risk of SQL injection. Finally, we pass the insert statement an array of values to insert into the prepared SQL, and it executes that statement. Following successful completion of all operations, we commit the transaction, or roll it back on failure.
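The “prepare once, reuse many times” idea can be sketched as follows. This is only an illustration: it assumes the pdo_sqlite driver is installed and uses an in-memory SQLite database, with a made-up accounts table, so that it can run without a MySQL server.

```php
<?php
// Assumes the pdo_sqlite driver is available; the in-memory database is only for demonstration.
$con = new PDO('sqlite::memory:');
$con->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$con->exec('CREATE TABLE accounts (amount INTEGER)');

// Prepare the statement once...
$insert = $con->prepare('INSERT INTO accounts (amount) VALUES (?)');

$con->beginTransaction();
try {
    // ...then execute it repeatedly with different bound values.
    foreach (array(100, 250, 75) as $amount) {
        $insert->execute(array($amount));
    }
    $con->commit();
} catch (PDOException $e) {
    // Undo every insert if any one of them failed.
    $con->rollBack();
    throw $e;
}

// Read the total back; the cast keeps the type stable across PHP versions.
$total = (int) $con->query('SELECT SUM(amount) FROM accounts')->fetchColumn();
echo $total, "\n";
```

Only the DSN on the first line is specific to SQLite; the prepare, execute, commit and rollBack calls are the same ones used in the MySQL example.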
That may seem like a lot, but it’s not. Take, for example, the typical way of doing that with MySQL’s libraries in PHP:
<?php
$con = mysql_connect('localhost', 'user', 'pass');
mysql_select_db('bank', $con);
$sql = 'SELECT SUM(amount) FROM accounts';
$resource = mysql_query($sql, $con);
$result = mysql_fetch_row($resource);
$sql = 'INSERT INTO total SET total = ' . $result[0];
mysql_query($sql, $con);
That’s all we did (with the exception of adding the transaction, because the mysql_* library has no functions for them). That’s it! Not too hard at all, right? PDO is just a different way of executing SQL queries and transactions.
PDO makes it easy to switch from database to database, and the syntax for PDO is the same regardless of the database you use. PDO takes care of quoting for your particular database, and it lets you use prepared statements even with databases that don’t support them natively, by emulating them. PDO isn’t without bugs, but it does provide a fantastic abstraction layer that aids in code development and makes portability easier (at least when doing simpler queries that don’t rely on lots of database-specific SQL functions).
Tomorrow morning, I head off for a weeklong trip to New England. This long-awaited vacation is something that has been in the works for a while.
Last year, I wrote about the importance of taking time. It is important for people to refresh themselves, and to enjoy some time away from their work to redefine their perspective, refresh their minds, and rejuvenate their spirits.
Joel Spolsky wrote about taking a sabbatical, and the importance of taking time for oneself. This is important because it helps to clear the mind, improve the spirit, and free the gunk from our lives that seems to bog us down.
I expect to come back rejuvenated, with more passion and increased appreciation for the work I do and the writers I serve here. I have not, however, abandoned you for next week. There are blog posts scheduled for publication next week, and they will be published, per the normal schedule.
So, have a great week, and see you in two.
Twitter has implemented the OAuth login system, allowing for users to centrally control what sites have access to their Twitter accounts, without having to share their passwords with the third parties. This improvement means that there is less risk of the full account credentials being used nefariously, since the user has to log into the session and explicitly authorize the behavior.
But this doesn’t mean that individuals are completely safe from nefarious behavior at the hands of third-party application providers.
Take for example Twibbon. Twibbon is a service that allows you to place a badge on your Twitter icon. Many of my followers have used Twibbon to decorate with sports teams, frameworks they prefer, or other icons. I even used it to add a Clemson tiger paw to my icon for a bit. But Twibbon is evil.
Twibbon does some pretty uncool things. First, as soon as you add the icon they post a tweet “on your behalf” announcing that you use Twibbon and suggesting that your followers should, too. They do not, of course, give the option to opt out of this behavior. That’s strike one.
Strike two was the discovery today that Twibbon also adds itself to the list of accounts you follow. That’s right – without asking, they automatically follow themselves with your account. This behavior is not well disclosed, either, nor can you opt out.
But for the third strike, they had to go one step further and do something completely nefarious and rude: they also take the liberty of marking their Twitter updates as updates that should be sent out via SMS. I discovered this trick when I was examining the list of people that I follow. I don’t have any updates sent to me via SMS, except for direct messages, because I don’t like using my text messages when I can just read tweets on my iPhone for free (using Tweetie).
Technically, Twibbon discloses most of this behavior. In little tiny letters, they tell you that they are going to tweet on your behalf and have you follow them. But they do not disclose that they will be signing you up for SMS updates.
Services like Twibbon provide value to Twitter, but they cannot be allowed to simply opt you into their marketing schemes on a whim. Not when they’re given read and write access to your account. OAuth helps keep nefarious behavior in check, but doesn’t prevent it altogether. Twitter needs to do more to ensure that services like Twibbon disclose and allow for the opt-out of these kinds of actions.
The New York Times did a profile on the topic of pair programming, the art of writing software with a partner. They looked at it through the eyes of an individual who does pair programming every day.
The profile is pretty good, and makes a strong case for pair programming. While I’m not fully prepared to surrender my freedom to another person for 100% full-time pair programming, I think that doing pair programming is something that can be very effective.
One thing that the New York Times doesn’t really play up is that pair programming is good for management. This is sometimes lost on managers, who wonder why they should use two programmers to do the work of one. They miss the point, though, when they do the math. First, a second programmer provides a second set of eyes, meaning that bugs are reduced. Fixing bugs takes time, so this reduction in bugs actually saves time. Second, managers also miss the fact that two people put in eight hours of productivity each day, together, rather than perhaps four hours total if they were programming separately. This is because programmers get bored, get distracted, or generally get off task, but with another person there is peer pressure to keep working.
At the end of the day, even if fewer lines of code are written, the problems solved, bugs avoided, and logic worked out are of higher quality and improved stability. For those who have never tried pair programming, I highly recommend it.
When setting up a web server with PHP, there are a number of settings that are critical to consider. PHP 5.3 contains both a development INI file and a production INI file; however, users of older PHP releases (or those who don’t have direct control over their INI files) will want to pay attention and make sure that certain settings are configured.
These settings are the settings that I use whenever I configure a PHP server.
register_globals = Off A holdover from ancient PHP versions, this setting is by default turned off and should remain that way. Even though it is disabled by default, many hosts enable it to continue supporting legacy code; in your own code, I recommend you set it to off. You can turn it off on a directory basis using htaccess files.
magic_quotes_gpc = Off This is another old holdover from PHP, and is slated for removal in PHP 6, along with register_globals. This automatically adds slashes to all GET, POST and COOKIE data, meaning that a posted string of “This is my code’s string” gets converted to “This is my code\’s string”. magic_quotes_gpc offers NO security; you should turn it off. It can be turned off per directory.
error_reporting = E_ALL | E_STRICT This is the strictest error setting you can insist on from PHP. Some people will disagree with the E_STRICT statement; I think that it’s important that our code conform to high code standards, and that means having E_STRICT turned on. This should be done even on production, because you want to know about errors you’re getting from your code, even if you didn’t see them during testing.
display_errors = Off Even though we want to have PHP raise errors, we don’t want them displayed to the end user! Turn display_errors to off and log the errors instead. You can configure the log path in each directory, and you should make use of this feature. Displaying errors to the end user is a security vulnerability, because it allows them to determine the operating system and file structure of your application.
session.gc_maxlifetime = 28800 This setting is how long a session is valid on your system. The default length of time is a paltry 1440 seconds, or 24 minutes. That means that someone reading a long article behind a login portal might get logged out after they’re finished. The setting I use is 8 hours, which is enough time for most users. You can also use four hours (14,400 seconds).
short_open_tag = 0 This is on by default, but using short tags (<?) is bad form. Turn it off and don’t utilize it. (Edit) There are two reasons for this. First, and the more rare reason, is that it could create problems with XML parsing. If you ever have the need to embed PHP in XML (as I did once), you may run into this. This is rare, but possible. Second, and more common, is that if you ever change hosts, or start using a host that configures PHP for you, they may disable the short tags by default. This could leave you scrambling for a fix for your code. Changing 20,000 <? into <?php can take a long time (I’ve done it; it can be a pain).
upload_max_filesize = 10M & post_max_size = 11M If you do anything with file uploads, you’ll find that the default 2 MB is woefully inadequate. I set mine to at least 10 MB, to ensure that most files I want are uploaded. You’ll also want to up the post_max_size due to the fact that it is set to a default of 8 MB, which will cause your file upload to break.
How to Set PHP Directives In htaccess Files
Finally, a discussion of how to set these values: if you make use of a shared host, you won’t be able to get direct access to the php.ini file. There are two different htaccess directives you need to be aware of: php_flag and php_value. The use of php_flag is reserved for boolean values, like register_globals and magic_quotes_gpc. You use php_value for things that are not boolean, like error_reporting and error_log.
For example, the syntax for turning off register_globals is as follows:
php_flag register_globals Off
The syntax to set the error log file path is as follows:
php_value error_log /var/www/logs/php_errors.log
This blog entry implements The Beginner Pattern.
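As a side note to the htaccess approach, some of these directives can also be changed from inside a script with ini_set(). The sketch below is only an illustration and covers the error-handling settings; register_globals, magic_quotes_gpc, short_open_tag and upload_max_filesize cannot be changed at runtime, so they still need php.ini or htaccess.

```php
<?php
// Log errors instead of showing them to visitors.
ini_set('display_errors', '0');
ini_set('log_errors', '1');

// Report everything. On PHP versions before 5.4, E_STRICT had to be OR'ed in separately.
error_reporting(E_ALL);

echo 'display_errors = ', ini_get('display_errors'), "\n";
```

Settings changed this way only apply to the current request, so the calls need to sit in a file that every page includes.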
Earlier this week, I was contacted by Packt Publishing and asked to review one of their forthcoming books. I felt honored that they asked, and I’ll be reviewing the book on my blog in the next few weeks. For those who just can’t wait to see what they’re offering, you can check out the book, PHP Team Development.
In the interests of full disclosure, I received nothing in exchange for the review, except a free copy of the book (which is necessary in order to write a review).
When I first learned PHP 5’s object oriented syntax and rules, I didn’t see much of a point to the interface options. I felt that I could do more by defining abstract classes and at least filling in some of the methods with some details. Lots of people in the PHP world still aren’t 100% sure the reasons that interfaces exist, or the best way to use them. However, interfaces are very cool, and anyone who does OOP in PHP should know about them.
To start, what is an interface? An interface is a collection of completely abstract methods. Interfaces do not contain any of the inner workings of the application; instead, they serve the sole purpose of setting a structure for the objects that implement them. All of their methods must be public. Here is a sample interface:
<?php
interface DatabaseI
{
    public function __construct();
    public function connect();
    public function query();
}
That doesn’t look very interesting at all, and it’s not, until we start getting into type hinting and using instanceof in PHP 5. This is where the power of interfaces comes into play.
As a developer, I’ve often wished there was a way to know that an object – any object – had implemented certain methods. With type hinting and the instanceof operator, it is now possible to determine whether or not an object has those methods, and interfaces make this even easier. Take for example:
class withoutTypeHinting
{
    public function __construct($databaseO)
    {
        // We can only hope the object has a connect() method, so we check first.
        if (method_exists($databaseO, 'connect')) {
            $databaseO->connect();
        }
    }
}
// Now let's use typehinting from our previous example
class withTypeHinting
{
    public function __construct(DatabaseI $databaseO)
    {
        // The type hint guarantees that connect() exists.
        $databaseO->connect();
    }
}
What’s the difference there? In the first example, we’re hoping that the method exists so that we can call it. We’ll have to throw an exception or do something else if the method doesn’t actually exist. But in the second example, we’re demanding an object that has implemented the methods in DatabaseI. We know that these methods must have been implemented, because we’re getting an instantiated object that implements the interface. So the method must exist.
Interfaces make it much easier to predict what methods will be available, and also to ensure that subclasses still conform to some semblance of the structure you’ve defined.
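Here is a small, self-contained sketch of the same idea that can be run as-is. The names ConnectableI, FakeDatabase and openConnection() are invented for this illustration and do not come from the examples above.

```php
<?php
// Hypothetical interface and class names, for illustration only.
interface ConnectableI
{
    public function connect();
}

class FakeDatabase implements ConnectableI
{
    public function connect()
    {
        return 'connected';
    }
}

// The type hint guarantees that $database has a connect() method.
function openConnection(ConnectableI $database)
{
    return $database->connect();
}

$database = new FakeDatabase();
var_dump($database instanceof ConnectableI); // FakeDatabase implements the interface
echo openConnection($database), "\n";
```

Passing an object that does not implement ConnectableI to openConnection() is rejected by PHP before the function body runs, which is exactly the guarantee the method_exists() version lacks.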
Interfaces are not without their drawbacks. One huge drawback of interfaces is that you must implement them precisely as they are defined; that is, because our __construct() method took no arguments, your __construct() must either take no arguments or give each argument a default value (e.g. __construct($a = true, $b = false); // etc). When defining an abstract class, and abstract methods, the rules of inheritance apply, making it possible to redefine the argument list. Also, because you have not defined anything but the method signature, you have no way of forcing it to return a particular type. Finally, you may not build an interface containing protected methods; you must use an abstract class for this.
Still, interfaces rock. They allow you to enforce the type of object and the methods that must be defined, and with type hinting you can always feel confident that a method is implemented, regardless of how that interface is extended.
One of the most confusing things for new programmers (and it even trips me up sometimes) is how to test for boolean conditions in code. As developers, we want to develop code that never emits notices or warnings, and PHP gets a bit antsy when we develop code that utilizes uninitialized variables. Lucky for us, PHP makes it easy to test for these variables without getting these notices.
PHP (like most languages) evaluates a logical expression left to right. For an AND condition, both conditions have to be true, so PHP stops evaluating as soon as it finds a condition that is false. That means that we can use isset() or empty() as the first condition in an if statement, and avoid raising notices.
For those who don’t know: isset() and empty() are two language constructs in PHP. A language construct is not a function, but a special language tool (like an if-else statement or a while loop), and in this case isset() and empty() allow us to test variables for various conditions.
When To Use isset()
The construct isset() tests whether or not a variable exists in the current scope. If the variable has been defined in that scope with any value except null, isset() will return true. Otherwise, it will return false.
You should use isset() to guard a variable that you then want to test for another condition. For example:
<?php
if (isset($var) && $var > 6)
{
    // do something here
}
if (isset($var) && $var === false)
{
    // Do something else here.
}
?>
If the variable is not set, isset() will return false. This will cause the first condition to be false, and PHP will not evaluate the second condition (which would emit a notice).
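The same guard works for array keys, which is where it tends to matter most in practice. The helper below is a hypothetical sketch; the function name and the 'age' key are made up for the example.

```php
<?php
// Hypothetical helper: true only when 'age' exists and is greater than 6.
function isOlderThanSix(array $data)
{
    // If isset() is false, PHP never evaluates the comparison on the right.
    return isset($data['age']) && $data['age'] > 6;
}

var_dump(isOlderThanSix(array('age' => 7)));
var_dump(isOlderThanSix(array('age' => 3)));
var_dump(isOlderThanSix(array()));
```

The last call passes an array without the key at all, and thanks to the short-circuit it returns false without raising a notice.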
When To Use empty()
The empty() construct works in a very similar fashion to isset() but has a difference: you want to use empty() when you care whether the variable has a value that would evaluate to true.
Consider the following code sample:
$variable = false;
if (isset($variable))
{
    echo 'This variable is set.';
}
if (!empty($variable))
{
    echo 'This variable is not empty.';
}
In the first example, isset() tells us that the variable is in fact set in the current scope. But in the second test, the variable is deemed empty, because the value is false. This is an extremely useful tool, but one that can be dangerous! False is an absolutely valid value (for example, you might set configuration options to false) but you may want to execute whatever is inside an if statement even if the value is false.
Additionally, empty strings, null, the boolean false, 0, the string “0” and empty arrays all cause empty() to return true (the string “false”, however, is not empty). This can create problems that you’ll want to avoid, for example:
<?php
$string = "A string to test.";
$var = strpos($string, "A string"); // Returns integer 0 as the position it was found.
if (empty($var))
{
    echo 'Not found!';
}
?>
This example illustrates a case where using empty() would create a problem. strpos() returns 0 if it finds the needle at the beginning of the string (position 0). However, if it does not find it at all, it returns false. However, false and 0 both evaluate to empty! This code sample would be better rewritten as follows:
<?php
$string = "A string to test.";
$var = strpos($string, "A string"); // Returns integer 0 as the position it was found.
if (isset($var) && $var === false)
{
    echo 'Not found!';
}
?>
The use of the isset() means that we know the variable exists but we don’t evaluate its value yet. The second condition evaluates whether or not the variable is actually false (three equals signs compare type as well as value).
In a nutshell, the simple rules of empty() and isset() are these: use isset() if you only care whether the variable exists in the current scope, and use empty() if you care whether the variable holds a value that evaluates to true.
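If you are ever unsure how empty() treats a particular value, a quick script like the sketch below will tell you. The sample values and labels are chosen only for illustration.

```php
<?php
// Sample values to test; the labels are only for display.
$samples = array(
    'empty string'   => '',
    'integer 0'      => 0,
    'string "0"'     => '0',
    'null'           => null,
    'boolean false'  => false,
    'empty array'    => array(),
    'string "false"' => 'false',
    'integer 1'      => 1,
);

$results = array();
foreach ($samples as $label => $value) {
    // Record whether PHP considers this value empty.
    $results[$label] = empty($value);
    echo $label, ': ', var_export($results[$label], true), "\n";
}
```

Everything in the list is reported as empty except the string “false” and the integer 1, which is why a strict comparison (===) is the safer test when 0 or false are legitimate values.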
This blog entry implements The Beginner Pattern.
This entry is part of an ongoing series involving the review of a code sample and its refactoring. For the original code sample, see here.
The topics discussed in this entry may be fairly advanced. Please feel free to ask questions, and discuss best practices.
If you’ve been following this series from the beginning, take a moment to look at the original code sample and compare it with where we are now. We’ve come a long way!
There is one last area that I want to address, and this has everything to do with object-oriented principles and code reusability. Those who are familiar with OO programming realize that the use of classes does not make something object-oriented by nature. In this final part of the series, we’ll move one step closer to being object-oriented, by introducing the concepts of request and response objects.
At the moment, our object takes arguments like most functions do. This has some limitations. The first limitation is that the object must be aware: that is, it must have an understanding of the request it is being passed, and the response that it is getting from Twitter, as well as the response it will give back to our application. This means that in the event that something ever changes about the way that response is organized, we have to change this code, explicitly. I would like to avoid that.
The first way we will change this need to be aware is by altering the way the request comes into the object. We will do this by passing in a request object.
/**
 * Constructor.
 * @param HTTPRequest $http The HTTP request object.
 * @param IRequest $request The request object.
 * @param string $username The username of the Twitter user
 * @param string $password The password of the Twitter user
 * @return void
 */
public function __construct(HTTPRequest $http, IRequest $request, $username, $password)
{
    $this->user = $username;
    $this->pass = $password;
    $this->http = $http;
    $this->request = $request;
}
The request object will contain the message we plan to pass to the Twitter API, and could contain other information. It could potentially contain the username and password, but since the point of this object is to log in from the backend, I’m going to assume that the credentials are coming from a configuration file and not from the environment or the end user.
Why would we want to use a request object? The reason is that the API will be the same for the request object, regardless of the structure of the $_POST/$_GET/$_REQUEST superglobal arrays. Additionally, you can add filtering capabilities to your request object, and there are a number of robust features you can add (for example, automatically turning JSON into a request object for use with web services). Your object can be totally ignorant of the structure of the request, and still access it, through the settings provided in the request object.
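To make that a little more concrete, here is a minimal sketch of what such a request object could look like. Only the IRequest name and the get() call appear in this series; the exact get() signature, the ArrayRequest class and the trim-based filtering are assumptions made for the example.

```php
<?php
// Assumed interface: the series only shows that the request object has a get() method.
interface IRequest
{
    public function get($key, $default = null);
}

// Hypothetical implementation backed by a plain array (e.g. $_POST or decoded JSON).
class ArrayRequest implements IRequest
{
    private $data;

    public function __construct(array $data)
    {
        $this->data = $data;
    }

    public function get($key, $default = null)
    {
        if (!isset($this->data[$key])) {
            return $default;
        }
        // Minimal filtering: trim surrounding whitespace from strings.
        return is_string($this->data[$key]) ? trim($this->data[$key]) : $this->data[$key];
    }
}

// The same API works whether the data came from a form or from a JSON payload.
$fromForm = new ArrayRequest(array('message' => '  Hello, Twitter!  '));
$fromJson = new ArrayRequest(json_decode('{"message": "Hello from JSON"}', true));

echo $fromForm->get('message'), "\n";
echo $fromJson->get('message'), "\n";
```

The class that consumes the request never touches a superglobal, so changing where the data comes from means swapping the request object rather than editing the consumer.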
The second thing that we want to do is apply this principle to our response. Our tweet() method returns true or false, but it returns true or false in error conditions as well as in success conditions. Additionally, there’s no ability for us to utilize the data that Twitter gives us back during a success or failure condition. This is sub-optimal.
Let’s assume that we have a response object capable of returning a response, and it can take a string, an exception, html or JSON (depending on which arguments you pass it). With a little modification, our code now looks like this:
/**
 * A public method to post a tweet.
 * @return object An object implementing IResponse.
 */
public function tweet()
{
    $message = $this->request->get('message');
    // Sanity checks. These are error conditions.
    try {
        $chars = strlen($message);
        if (empty($chars) || $chars > 140)
        {
            throw new Exception('The message provided is not valid for tweeting due to a length violation. Max length: 140 chars. Min length: 1 char. Given length: ' . $chars);
        }
        if ($this->already_tweeted($message))
        {
            throw new Exception('This message has already been sent to Twitter');
        }
    } catch (Exception $e) {
        $response = new ExceptionResponse($e);
        return $response;
    }
    // STATUS UPDATE ON TWITTER
    $this->http->setHost($this->baseHost . "statuses/update.json?status=" . $message);
    $this->http->setCredentials($this->user, $this->pass);
    $this->http->setOption(HTTPRequest::RETURN_TRANSFER, true);
    $this->http->setOption(HTTPRequest::POST, true);
    $TwitterResponse = $this->http->execute();
    $rawRespo
Truncated by Planet PHP, read more at the original (another 6822 bytes)
Recently, I’ve been getting more and more into community-supplied code, since it’s generally been getting better. Namely, I’ve been exploring the PEAR offerings, and seeing what pieces I can integrate into my personal framework for development. One of these packages is the package called Log, which allows for easy logging of application events.
So imagine my surprise and sadness to learn that the package still supports PHP 4.
The truth is that there are lots and lots of hosts that are still on PHP 4. There are still lots and lots of lines of PHP code that make use of PHP4-level features (see Wordpress and Drupal for examples). My complaint is that PHP 4 has been officially unsupported for the last year now, and PHP 5 has been out for over five years. Five years! That’s a long time to wait to take advantage of new features in the language.
Sadly, developing for PHP 4 backwards compatibility is something that companies and individuals are still doing. Wordpress released a new Widget API in version 2.8 that relies on the old-style PHP 4 constructor. Apparently, for Wordpress and many other developers, wide adoption is more important than language improvements.
In the case of Log, supporting PHP 4 is a reality and a requirement imposed by PEAR. I certainly do not discount the efforts of package maintainers like Jon Parise, Jan Schneider and Chuck Hagenbuch, who are bound by PEAR’s rules that prohibit the breaking of backwards compatibility. These individuals are working hard to support a package that is used by hundreds of developers, and is depended on for the proper functioning of 21 individual PEAR packages. I’ve been in the process of working through a PHP 5 version of the Log package, both for my own use and perhaps contribution to the community, because I think that the ability to use statics, abstracts, interfaces and typehinting is important enough to merit the work.
Still, I look forward to the day when PHP 4 finally does go away forever, leaving us with a much better code base and happier developers.
One of the decisions that has to be made each time an application is written for distribution is how best to set up the configuration files. There are a number of different approaches taken to this: some opt to use the define() function and define constants, while others use large arrays. The purpose of this post is to discuss a couple of configuration options that sometimes are overlooked by developers (including myself until last year).
Class Constants
One of the biggest struggles in object-oriented programming is how to go about setting up the default values and configuration values. For example, if you have a database class, how do you define the connection parameters?
One solution that I like is to use class constants. Class constants are just like constants you define with the define() function; however, they adhere to more object-oriented patterns, and help keep you from having to draw the constants from outside the class. There are some drawbacks, though: while constants created with define() can be given a value from a variable at runtime, class constants have to be defined in the code base itself. That is, you have to use a literal (such as a string) or an outside constant to set the constant’s value.
Let’s take a look at how this works:
Example 1: A typical database constructor
// Assuming a MySQL connection and
// not addressing database interoperability...
public function __construct($username = 'user', $password = 'pass', $host = 'host', $database = 'mydb')
{
$dsn = 'mysql:host=' . $host . ';dbname=' . $database;
try {
$this->dbh = new PDO($dsn, $username, $password);
}
catch (PDOException $e) {
throw new DBException('Unable to connect to PDO: ' . $e->getMessage());
}
}
Example 2: A database constructor using constants
// Assuming a MySQL connection and
// not addressing database interoperability...
public function __construct($username = Config::DBUSER, $password = Config::DBPASS, $host = Config::DBHOST, $database = Config::DATABASE)
{
$dsn = 'mysql:host=' . $host . ';dbname=' . $database;
try {
$this->dbh = new PDO($dsn, $username, $password);
}
catch (PDOException $e) {
throw new DBException('Unable to connect to PDO: ' . $e->getMessage());
}
}
It might look initially as though I’ve simply replaced the default arguments with another set of default arguments. But look again. I’ve done more than that. The values stored in the Config class are likely identical to the ones we were giving the constructor before, but by placing them all in one class (and presumably placing our other configuration values in that class as well), we’ve simplified distribution. Imagine if you distributed your code to six different servers, all with different hosts. It would be insane to change the code every time!
The use of class constants also allows us to access values in another class through PHP’s object model, rather than relying on the “hacks” in PHP (like the fact that constants defined with define() are in all scopes). It also means that we don’t have to pass in the connection arguments each time we connect – unless we want to connect to a non-standard database.
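The Config class itself is not shown in the examples. A minimal sketch of what it could look like follows; the constant names match Example 2, but the values and the buildDsn() helper are placeholders invented for this illustration.

```php
<?php
// Hypothetical Config class; constant names follow Example 2, values are
// placeholders.
class Config
{
    const DBUSER   = 'user';
    const DBPASS   = 'pass';
    const DBHOST   = 'localhost';
    const DATABASE = 'mydb';
}

// Hypothetical helper: build a DSN from the defaults, overriding them only
// when a non-standard database is needed.
function buildDsn($host = Config::DBHOST, $database = Config::DATABASE)
{
    return 'mysql:host=' . $host . ';dbname=' . $database;
}

echo buildDsn(), "\n";                  // mysql:host=localhost;dbname=mydb
echo buildDsn('db2.example.com'), "\n"; // mysql:host=db2.example.com;dbname=mydb
```

Moving to a new server then means editing one class rather than hunting through every constructor.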
This model is used by lots of projects, including Zend Framework and Propel, but it often trips up first-time developers of object-oriented code.
INI Files
Every professional PHP developer is familiar with the php.ini syntax, and has probably edited their php.ini file on occasion. What many developers don’t know (and I didn’t even know until DrupalCon 2009) is that PHP includes a function for parsing a configuration file in the INI style. The function is called parse_ini_file() and it parses an INI file into an array.
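Here is a small sketch of the parsing step. It uses parse_ini_string(), which accepts the same syntax as parse_ini_file(), only so that the example is self-contained; the section and key names are placeholders.

```php
<?php
// parse_ini_string() takes the same INI syntax as parse_ini_file(), which
// keeps this sketch self-contained. The keys below are placeholders.
$ini = <<<INI
; Application configuration
[database]
host = localhost
database = mydb

[paths]
uploaddir = /var/www/uploads
INI;

// Passing true as the second argument keeps the [section] names as
// top-level array keys; without it, all settings are merged into one array.
$config = parse_ini_string($ini, true);

echo $config['database']['host'], "\n";   // localhost
echo $config['paths']['uploaddir'], "\n"; // /var/www/uploads
```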
Example 3: Sample INI file
; Application configuration script.
[database]
host = localhost
database = mydb
username = user
password = password
[paths]
uploaddir = /var/www/uploads
includedir = /var/www/includes
publicdir = /var/www/public
By using the parse_ini_file() function, we get an array that looks like this (output using print_r()):
Example 4: Parsed INI file as array in PHP
Array
(
[database] => Array
(
[host] => local
Truncated by Planet PHP, read more at the original (another 2203 bytes)
This entry is part of an ongoing series involving the review of a code sample and its refactoring. For the original code sample, see here.
Now that we’ve worked out the abstraction issues and the logic questions, we should take a moment to focus our attention on a few of the issues relating to the architecture and testability of the class we’ve worked out.
A couple of big architecture issues raise their heads early on. The first one is this set of code:
<?php
// class_Twitter.php - NPC TWITTER AUTO-FEED
error_reporting(E_ALL);
// DO NOT RUN THIS SCRIPT STANDALONE (HAS PASSWORD)
if (count(get_included_files()) < 2) {
header("HTTP/1.1 301 Moved Permanently"); header("Location: /"); exit;
}
This raises a number of architectural issues that we should address. My recommendation is that we remove this code altogether. The first line, about error reporting, really should be set at the application level, rather than the script level. As for the redirect, this is intended to prevent the script from being called directly. However, it’s a good idea to place classes like this outside the document root anyway, meaning that they would never be called directly. When including this file, there’s the potential that you’ll not have included enough files and you’ll inadvertently redirect your user. So let’s drop these lines altogether. This will improve reusability and improve the architecture.
There’s another line of code that is particularly troubling:
$this->http->setHost($this->baseHost . "statuses/update.json?status=".urlencode(stripslashes(urldecode($message))));
It may not look obvious at first, but this code relies on the assumption that the magic_quotes_gpc directive is set to on.
For those who don’t know, magic_quotes_gpc automatically adds slashes to all GET, POST and COOKIE variables that come into an application. This deprecated feature represents a poor programming practice, and its use is discouraged. Though it remains on by default in PHP installations, it should be turned off altogether if at all possible. It’s slow, and potentially dangerous. Additionally, it will be removed in future versions, meaning that code relying upon magic_quotes_gpc will break in the future.
Now, oftentimes programmers don’t have direct access to the magic_quotes_gpc directive in php.ini, but it can be set at the .htaccess level, if your hosting provider allows you to change PHP settings in this way.
I’m going to make the assumption that we’ve disabled magic_quotes_gpc, as is recommended by the PHP manual.
Another problem that we need to address is the fact that this object gets thrown away after it’s been used. We have the property $done, which gets set when a tweet is successful. Unless there’s a good reason for it, objects should never be designed this way. We’ll remove this code in the final draft (see below).
Moving on to testing, again we have to ask ourselves about abstraction. Where testing is concerned, it would be very easy to abstract the testing out of this object. Making use of things like the HTTP_Request2_Adapter_Mock class will help us to “test” the Twitter interface without actually posting a tweet. Unit testing software like PHPUnit will give us reports on code coverage.
If we opt to leave the test code in the object itself, it should be abstracted into its own method. But I don’t recommend this course at all, because testing should be conducted at the application level. Moving testing out of the class also allows us to remove references to the $done property, and remove the $you property as well as the test() method, the $test property, and a large bit of code from the tweet() method. All of this code will then not be loaded on each request of our class. Not a bad improvement.
Something else that we’ll do to improve testability is inject the HTTPRequest object into the class rather than create it there. This allows us to pass in a mock object, and it gives us greater control over the environment when we do our testing.
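A minimal sketch of what injection buys us is shown here. Every name in it (SketchHttp, SketchMockHttp, SketchTwitter) is a hypothetical stand-in, not the HTTPRequest class from the series, and the canned JSON is an assumption about the shape of a successful response.

```php
<?php
// All names here are hypothetical stand-ins for the classes in the series.
interface SketchHttp
{
    public function execute($url);
}

// A mock that records requests instead of contacting Twitter.
class SketchMockHttp implements SketchHttp
{
    public $requests = array();

    public function execute($url)
    {
        $this->requests[] = $url;
        return '{"id": 12345}'; // canned response, assumed shape
    }
}

class SketchTwitter
{
    protected $http;

    // The HTTP object is injected rather than created here.
    public function __construct(SketchHttp $http)
    {
        $this->http = $http;
    }

    public function tweet($message)
    {
        $raw = $this->http->execute('statuses/update.json?status=' . urlencode($message));
        $data = json_decode($raw, true);
        return isset($data['id']) && $data['id'] > 0;
    }
}

$mock = new SketchMockHttp();
$twitter = new SketchTwitter($mock);
var_dump($twitter->tweet('Hello world')); // bool(true)
echo $mock->requests[0], "\n";            // statuses/update.json?status=Hello+world
```

Because the test controls the mock, it can assert on exactly what would have been sent without ever posting a tweet.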
It was pointed out by the blog PHP In Action that one of the things that we should have done first is write unit tests. I think that t
Truncated by Planet PHP, read more at the original (another 4927 bytes)
Last week, Cal Evans retweeted James McGovern, who originated this tweet:
I’m not a fan of catchy sayings and one-line wonders, but this tweet got me thinking. How many companies, especially in the economic world we’re in, think about training as something that they don’t want to do, or an investment they can’t afford?
There are any number of reasons why companies don’t train their employees. Perhaps they made a financial decision that the business is doing fine, and such an investment doesn’t make sense. Perhaps they see their employees as interchangeable parts that can be replaced, and investing in a particular employee would represent a massive cost that might never be repaid by their productivity. Perhaps they figure that the employee’s job isn’t that important to the company to merit any sort of training.
But failing to train an employee, any employee, might be the biggest mistake a company can make. Employees are partially responsible for the success of the company. The most successful companies listen to their employees, and take their suggestions and ideas to heart. The least successful companies listen only to the executive team, as though that team is the only place where good ideas originate. But employees that don’t have the appropriate level of training, understanding and intuition will never generate the ideas to keep a business growing.
In technology this is especially true, because technology changes so frequently. Training employees in new technologies will help keep them in tune with the community, the development efforts, and the products on the market. In turn, this will make their ideas better, and their suggestions more business ready. And the investment will pay dividends.
As for whether or not those “investments” you made will walk out the door, remember this: Microsoft spends millions every year on its Research and Development operation just to keep pace. Most large companies do the same, because they realized a long time ago that while sometimes the money might be wasted, having the next big idea could be a huge game changer in any industry. They’re not afraid to spend money in search of new business opportunities, and you shouldn’t be either. Training is a key component of that, and without it, new ideas won’t flow into your business the way you need to keep on top of your market.
This entry is part of an ongoing series involving the review of a code sample and its refactoring. For the original code sample, see here.
So far, we’ve done quite a bit of work on our Twitter class, making it better. There’s still work to be done, though, especially improving the logic.
The Twitter class we have now has a number of logical flaws in it that we need to address. Additionally, there are some logical flaws that we started with that I want to highlight, even though we’ve already fixed them. Let’s get started with those.
When we started, we were setting the host (www.twitter.com) in the constructor, and then concatenating it in the tweet method. But we were setting it as an object-level property. Let’s look at that code:
$this->host .= "statuses/update.xml?status=".urlencode( stripslashes( urldecode($message )));
This is a bad practice for one reason: it makes the object unusable a second time! If we ever use the object again, we’ll end up concatenating the message onto the end of the last message and getting a 404 error. Not good.
We’ve fixed that by setting the host inside of the HTTP object. Good.
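To see why the original approach breaks on reuse, consider this small sketch. The class SketchBrokenHost is hypothetical and exists only to reproduce the pattern of appending to a property with ".=".

```php
<?php
// Hypothetical class that reproduces the problem with a property and ".=".
class SketchBrokenHost
{
    public $host = 'http://twitter.com/';

    public function buildUrl($message)
    {
        // Mutates the property, so state leaks between calls.
        $this->host .= 'statuses/update.xml?status=' . urlencode($message);
        return $this->host;
    }
}

$broken = new SketchBrokenHost();
echo $broken->buildUrl('first'), "\n";
// http://twitter.com/statuses/update.xml?status=first
echo $broken->buildUrl('second'), "\n";
// http://twitter.com/statuses/update.xml?status=firststatuses/update.xml?status=second
```

Building the URL in a local variable, or handing it straight to the HTTP object, leaves the property untouched and the object reusable.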
I also have to ask: is an HTTP status code of 0 ever acceptable as a result? According to the author of this class, Twitter sometimes returned a status code of 0 but was successful in sending the tweet. However, Twitter ALSO returns a JSON or XML response, which is much more effective for determining the success (or failure) of a particular tweet. I’m going to assume that getting the response gives us pure JSON, and use the Twitter API documentation to make some assumptions. Let’s take a look at the bottom of the tweet() method:
// STATUS UPDATE ON TWITTER
$this->http->setHost($this->baseHost . "statuses/update.json?status=".urlencode(stripslashes(urldecode($message))));
$this->http->setCredentials($this->user, $this->pass);
$this->http->setOption(HTTPRequest::RETURN_TRANSFER, true);
$this->http->setOption(HTTPRequest::POST, true);
$response = $this->http->execute();
$array = json_decode($response->getRaw(), true);
if(isset($array['id']) AND $array['id'] > 0)
{
return TRUE;
}
return FALSE;
That’s much better. Now, we’re evaluating whether or not the tweet has an ID that is greater than zero, meaning that if anything else is returned, we know the tweet was unsuccessful.
We still have some logic issues to sort out, though. People who use Twitter know that occasionally it fails. If it fails while we are attempting to tweet, we could run into some problems because the already_tweeted() method inserts our tweet into the database BEFORE the tweet is actually sent. This is a huge logical problem.
Thankfully it’s an easy one to solve, by breaking the already_tweeted function into two functions: checkTweet(), and logTweet().
/**
* Check to see if this tweet has already been posted.
* @param string $message The tweet
* @return bool
*/
protected function checkTweet($message)
{
$hash = md5( date('Y-m-d') . $message );
$c = new Criteria();
$c->add(TwitterLogPeer::THASH, $hash);
$count = TwitterLogPeer::doCount($c);
if($count > 0)
{
return TRUE;
}
return FALSE;
}
/**
* Log the tweet once it has been sent.
* @param string $message The tweet
* @return bool
*/
protected function logTweet($message)
{
$hash = md5( date('Y-m-d') . $message );
try {
$logTweet = new TwitterLog();
$logTweet->setDate(time()); // This will be turned into a DateTime object.
$logTweet->setHash($hash);
$logTweet->setTweet($message);
$logTweet->save();
} catch (Exception $e) {
throw new Exception('There was an error in saving the item to the database.');
}
return TRUE;
}
These fixes remove the biggest logic errors in the class. There are still a couple of considerations that the developer might want to take into account. The biggest is that the duplicate check is keyed on the date: it works well within a single day, but a tweet sent just before midnight (server time) and repeated just after will not be recognized as a duplicate. This is a small consideration, however, and you might consider whether fixing it provides significant value.
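The midnight case is easy to demonstrate. The helper sketchTweetHash() below is hypothetical; it mirrors the hash built in checkTweet() and logTweet(), but takes the date as a parameter so that both sides of midnight can be shown.

```php
<?php
// Hypothetical helper mirroring the hash used in checkTweet()/logTweet(),
// with the date passed in so the midnight case can be shown.
function sketchTweetHash($message, $date)
{
    return md5($date . $message);
}

$before = sketchTweetHash('Hello world', '2009-09-01'); // sent at 23:59
$after  = sketchTweetHash('Hello world', '2009-09-02'); // retried at 00:01

var_dump($before === $after); // bool(false): the duplicate is not detected
```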
Also worth considering is whether or not the object should even verify
Truncated by Planet PHP, read more at the original (another 5174 bytes)
This entry is part of an ongoing series involving the review of a code sample and its refactoring. For the original code sample, see here.
Editor’s Note: The response of the community to this series has been great, and I’ve been given a large number of suggestions. I’ve incorporated some of those suggestions into the code and into this article. Thanks to Jeff Carouth, Greg Beaver and Daniel O’Connor for their help and suggestions.
This entry will focus on our use of the database, and specifically on the already_tweeted() method. This method has a number of problems, and while we’re focusing on the implementation of the database, it’s important to note that we will also need to address some of the logic (which will be the next part of the series).
In our last entry, we focused on abstracting the HTTP request out to a separate class. Lots of people wrote comments with suggestions of HTTP handlers, including pecl_http, the PEAR HTTP class, HTTP_Request2 and the PEAR Log class for logging. These are all great suggestions, and all will help abstract out the class without causing us to have to write our own implementation of common problems (the Not Invented Here (N-I-H) syndrome).
In focusing on the already_tweeted method, one thing becomes immediately apparent: it is private. Following a suggestion from Greg Beaver that relates to our first discussion of coding standards, we will change it to a protected method so the class can be extended later on.
Let’s take a look at the function and dive into some of the mistakes in it:
/**
* Check to see if this tweet has already been posted and add it to the DB
* if it has.
* @param string $message The tweet
* @return bool
*/
protected function already_tweeted($message)
{
$text = mysql_real_escape_string(trim($message));
$date = date('Y-m-d');
$code = md5($text . $date);
$sql = "SELECT id FROM twitterLog WHERE thash = \"$code\" ORDER BY id DESC LIMIT 1";
if (!$res = mysql_query($sql))
{
die(mysql_error());
}
$num = mysql_num_rows($res);
if ($num)
{
return TRUE;
}
$sql = "INSERT INTO twitterLog (tdate, thash, tweet) VALUES ( \"$date\", \"$code\", \"$text \" )";
if (!$res = mysql_query($sql))
{
die(mysql_error());
}
return FALSE;
}
Let’s talk first about assumptions. This function assumes that a database connection has already been established for us. And it also assumes that we’re going to use MySQL. What if we want to use another database type (like Postgres, SQLite, or MSSQL)? These are both assumptions that I’d prefer not to make.
We have a few choices with regards to database interactions. We could refactor this using PDO, which would allow us many benefits: PDO is object-oriented, supports multiple databases with the same function calls and structure (meaning that changing database backends wouldn’t break our code), and is relatively easy to learn and understand. We could certainly refactor this using PDO.
The problem is, however, that using PDO doesn’t abstract out the development of our SQL queries. We still have raw SQL in our class. This is less than optimal.
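For comparison, here is a rough sketch of what the PDO route might look like. It assumes an in-memory SQLite database (so the pdo_sqlite driver must be available) and a table layout inferred from the queries in already_tweeted(); the function name sketchHashExists() is made up for this example. Note that the SQL strings are still sitting in our code, which is exactly the concern raised above.

```php
<?php
// Assumptions: an in-memory SQLite database stands in for the real one,
// and the table layout is inferred from the queries in already_tweeted().
$dbh = new PDO('sqlite::memory:');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$dbh->exec('CREATE TABLE twitterLog (id INTEGER PRIMARY KEY, tdate TEXT, thash TEXT, tweet TEXT)');

// Returns true when a row with this hash already exists.
function sketchHashExists(PDO $dbh, $hash)
{
    // A prepared statement replaces manual escaping of the value.
    $stmt = $dbh->prepare('SELECT COUNT(*) FROM twitterLog WHERE thash = :hash');
    $stmt->execute(array(':hash' => $hash));
    return (int) $stmt->fetchColumn() > 0;
}

$hash = md5('Hello world' . date('Y-m-d'));
var_dump(sketchHashExists($dbh, $hash)); // bool(false)

$insert = $dbh->prepare('INSERT INTO twitterLog (tdate, thash, tweet) VALUES (:tdate, :thash, :tweet)');
$insert->execute(array(':tdate' => date('Y-m-d'), ':thash' => $hash, ':tweet' => 'Hello world'));
var_dump(sketchHashExists($dbh, $hash)); // bool(true)
```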
Instead, I want to make use of an ORM (Object Relational Mapping) framework. My preference is Propel, as I’ve worked with it before, but you can use one of the many PHP-based ORMs (like Doctrine). This provides many benefits:
This isn’t to say that utilizing an ORM
Truncated by Planet PHP, read more at the original (another 2098 bytes)
For those who like the newest in development tools, Apple has surely delivered with the Snow Leopard operating system upgrade.
Apple has compiled PHP 5.3, including many of the extensions they forgot in the PHP 5.2.x version included with Leopard. This includes GD, and the MySQL Native Driver (mysqlnd) that is available in PHP 5.3. They’ve also compiled Subversion 1.6.2, and Apache 2 is included as well (2.2.11).
It does not appear that MySQL is included with Snow Leopard, though. PHP also does not have a configuration file, resulting in a warning being emitted by PHP about the timezone not being set. These are relatively easy problems to correct, though I have not tested the full MAMP stack for compatibility with products such as Wordpress or web2project.
Here’s a copy of the Apple ./configure line for PHP:
Brand-new PHP developers have drilled into their heads the concept of Filter Input, Escape Output (FIEO). This concept essentially insists that all user-provided content be filtered or escaped, without exception. With the delivery of PHP 5.2.0, this got a lot easier, because PHP included, by default, the Filter library.
Before the Filter library, doing something such as validating an email address often required an ugly regular expression along the lines of this:
<?php
$email = 'firstname.lastname@aaa.bbb.com';
$regexp = "/^[^0-9][A-z0-9_]+([.][A-z0-9_]+)*[@][A-z0-9_]+([.][A-z0-9_]+)*[.][A-z]{2,4}$/";
if (preg_match($regexp, $email)) {
echo "Email address is valid.";
} else {
echo "Email address is invalid";
}
?>
The Filter extension makes this easy for us by providing a built-in filter that we can use to validate an email address:
<?php
$email = 'firstname.lastname@aaa.bbb.com';
if(filter_var($email, FILTER_VALIDATE_EMAIL) !== false)
{
echo "Email address is valid.";
}
else
{
echo "Email address is invalid.";
}
This way has a number of benefits: first, it makes the code more readable. You don’t have to know regular expressions to see what it is we’re validating. Second it reduces the likelihood of errors. Since the same filter is applied each and every time, and has had the benefit of being reviewed by other core developers, you can feel confident that PHP has a working validation function.
There are a number of other validation and filtering functions you can use, including checking to make sure something is a string, applying addslashes(), checking for an integer or a boolean or the like. These filtering functions will generally be faster than a custom function you might write in PHP (being part of the PHP C code), and provide a fantastic amount of benefit to the filtering and validating of data. Check them out today!
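A few of the other built-in validators, sketched briefly. The 1 to 140 range is just an illustrative choice of options.

```php
<?php
// FILTER_VALIDATE_INT returns the integer on success, or false on failure.
var_dump(filter_var('42', FILTER_VALIDATE_INT));   // int(42)
var_dump(filter_var('abc', FILTER_VALIDATE_INT));  // bool(false)

// A range can be supplied through the options array.
$options = array('options' => array('min_range' => 1, 'max_range' => 140));
var_dump(filter_var('150', FILTER_VALIDATE_INT, $options)); // bool(false)

// FILTER_VALIDATE_BOOLEAN understands "yes", "on", "true" and "1".
var_dump(filter_var('yes', FILTER_VALIDATE_BOOLEAN)); // bool(true)
```

Because a valid value such as "0" also filters to something falsy, compare the result against false with !== rather than relying on a loose truth test.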
This entry is part of an ongoing series involving the review of a code sample and its refactoring. For the original code sample, see here.
There are a number of fundamental concepts in object-oriented design that we should take notice of. One of these concepts is abstraction. This is what we will focus on today and in the next entry.
This article will focus on the constructor method. There are a couple of problems, namely that the constructor itself does a lot of actual work. Also, we have the cURL setup done in the constructor. This object is a Twitter object, not a cURL object; this means that we should decouple the cURL functionality and abstract it into a separate object of its own. This not only will make our object more true to its functionality as a Twitter object, but will allow for greater reuse (since we’ll be able to use the cURL object for more).
There’s some debate about whether or not you ought to write things like wrappers for native PHP functionality. One thing that is missed in the “don’t write a wrapper for cURL!” argument is that a good wrapper for HTTP requests shouldn’t be limited to cURL. In fact, it should make use of fopen(), file_get_contents(), etc. in the case that cURL doesn’t exist. A good object would test for this, and change its behavior based on preset rules. (strategy pattern, anyone?)
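The strategy idea can be sketched as follows. All of the names (SketchTransport, SketchCurlTransport, SketchStreamTransport, sketchChooseTransport) are hypothetical, and real transports would need error handling, timeouts and POST support that this sketch leaves out.

```php
<?php
// Hypothetical names throughout; this sketches the strategy idea only.
interface SketchTransport
{
    public function isAvailable();
    public function get($url);
}

class SketchCurlTransport implements SketchTransport
{
    public function isAvailable()
    {
        return function_exists('curl_init');
    }

    public function get($url)
    {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        return curl_exec($ch);
    }
}

class SketchStreamTransport implements SketchTransport
{
    public function isAvailable()
    {
        return (bool) ini_get('allow_url_fopen');
    }

    public function get($url)
    {
        return file_get_contents($url);
    }
}

// Pick the first transport that can run in this environment.
function sketchChooseTransport(array $candidates)
{
    foreach ($candidates as $transport) {
        if ($transport->isAvailable()) {
            return $transport;
        }
    }
    throw new RuntimeException('No HTTP transport is available.');
}

$transport = sketchChooseTransport(array(new SketchCurlTransport(), new SketchStreamTransport()));
echo get_class($transport), "\n";
```

The calling code only depends on the interface, so adding another transport later does not touch the Twitter class at all.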
There are two things that we need to do: first, we need to abstract the cURL tools out of the class. Second, we need to further encapsulate the class. The constructor should not do the work of cURL setup; this should be handled elsewhere. We will do this by writing a new class (which I will not show) for HTTP requests, and then we will handle the HTTP request in the existing functions. Our constructor will now look like this:
/**
* Constructor.
* @param string $username The username of the Twitter user
* @param string $password The password of the Twitter user
* @param string $email Optional email address for testing
* @return void
*/
public function __construct($username, $password, $email = null)
{
$this->user = $username;
$this->pass = $password;
$this->you = $email;
$this->baseHost = "http://twitter.com/";
$this->done = FALSE; // DEFAULT - NOT ALREADY TWEETED
$this->test = FALSE; // DEFAULT - THIS IS LIVE, NOT A TEST
$this->http = new HTTPRequest();
}
We will assume that the HTTPRequest class, which is not shown, will do error-handling, check to see the appropriate methods are available, and handle configuration items like timeout, host, and other settings.
The next entry in the series will focus on abstracting the database layer, which will be roughly similar to this entry, but will contain more information about the reasons for abstraction and the process of abstracting the code.
Our class now looks as follows:
<?php
// class_Twitter.php - NPC TWITTER AUTO-FEED
error_reporting(E_ALL);
// DO NOT RUN THIS SCRIPT STANDALONE (HAS PASSWORD)
if (count(get_included_files()) < 2) {
header("HTTP/1.1 301 Moved Permanently"); header("Location: /"); exit;
}
/**
* Twitter Class
* @author Anonymous
*
*/
class Twitter
{
/**
* @var HTTPRequest HTTPRequest object.
*/
protected $http;
/**
* @var string Email address for administrator/tester
*/
protected $you;
/**
* @var string The username for the Twitter account.
*/
protected $user;
/**
* @var string The password for the Twitter account.
*/
protected $pass;
/**
* @var bool The flag for test mode
*/
protected $test;
/**
* @var bool Flag for whether or not this tweet has been sent today
*/
protected $done;
/**
* Constructor.
* @param string $username The username of the Twitter user
* @param string $password The password of the Twitter user
* @param string $email Optional email address for testing
* @return void
*/
public function __construct($username, $password, $email = null)
{
$this->user = $username;
$this->pass = $password;
$this->you = $email;
$this->baseHost = "http://twitter.com/";
$this->done = FALSE; // DEFAULT - NOT ALREADY TWEETED
$this->test = FALSE; // DEFAULT - THIS IS LIVE, NOT A TEST
$this->http = new HTTPRequest();
Truncated by Planet PHP, read more at the original (another 2751 bytes)
Anyone who reads my frequent pleas to involve yourself in the community knows that I’m a big fan of community development of open source projects. PHP is one of the world’s largest open source projects. And if I haven’t convinced you yet that you need to contribute, perhaps this will help encourage you.
Professionals (those who make their living from open source) owe it to the open source communities they utilize to give back. Anything less is akin to stealing – be it time, talent or treasure – from the community that keeps them in business.
That may sound harsh but consider: people who write open source software generally aren’t paid to do it. There are a few exceptions (some people are paid to write Wordpress or the Zend Framework, for example). But the core of PHP is contributed by people who are solely interested in making PHP better for their professional endeavors. Not contributing to their efforts is like showing up at a potluck without bringing a dish. Not cool.
Contribution to the community doesn’t have to be in the form of writing a patch or a new extension for PHP, or releasing some massive open source project. No, it can be as simple as contributing to the documentation, or submitting a bug report. We all come across bugs – if we don’t report them, no one can fix them! Each of us has something to contribute, if only we choose to do so. And when we contribute, we owe it to the community to do so in a full, complete way – submitting usable bug reports or documentation, for example.
The community exists because people choose to contribute. Giving back isn’t a matter of charity, but a matter of the survival of the product or project you use every single day. It’s not really optional, but it is easy. I’m personally wary of those who opt not to give back, if only because it makes me question how genuinely they appreciate and understand the product they’re using.
So file a bug report or write some documentation today. You’ll feel good, the community will benefit, and the whole ecosystem gets healthier. Remember: PHP depends on people just like you.
Someone asked me a few days ago what the best plugins for Wordpress are. That’s not a question I can answer definitively; however, I can answer the question about what plugins I prefer and use every day. Here’s a list of the plugins I use.
Akismet: This is a must-have plugin for spam protection. Written by the Automattic team (the same people responsible for Wordpress), it helps prevent you from ever having to deal with spam comments.
All in One SEO Pack: This powerful plugin rewrites various aspects of your site to be more SEO-friendly (like the title bar). It also lets you add meta information for search engines. This one is updated frequently so you may run into update fatigue, but I personally think this is also an important plugin.
Exclude Pages: If you want to exclude certain pages from your navigation, this plugin lets you do that with a simple checkbox. Why this isn’t part of the Wordpress core I don’t know; but this plugin will remove pages (and their children) from the nav menu, making it easy to create linked pages that aren’t part of the navigation.
Organize Series: For the creation of a series, you need this plugin. This plugin automatically links all posts in a series together, and creates a widget you can use to show the series on your theme. This is a must-have for anyone who writes multiple-part blog entries.
Sociable: One of the best ways to spread something around is to let people who like it talk about it. Sociable is a great widget for this. It shows up at the bottom of each entry, and you can customize which sharing options are offered to your readers by default. This is a must-have for sharing your content.
Add Post Footer: If you’ve ever had your content stolen in its entirety, you’ll want this plugin. This plugin allows you to add a footer to each and every entry, which can be invaluable for showing those thieves what-for. The interface is lacking, but the functionality is there, so give it a shot.
Subscribe To Comments: For discussions this is an incredibly important plugin. When people write comments they can click a checkbox to be notified of updates to that set of comments. This facilitates discussion, and is a plugin that is quite heavily used on my blog.
Syntax Highlighter Plus: Many of the code samples I use would be impossible without this versatile plugin. This is perhaps the best plugin I’ve ever found. It allows you to write in tons of languages, and give syntax-highlighted examples of all of them. Definitely recommended for technical bloggers.
WP Twitip ID: I like to know when my commenters are on Twitter, so that I can follow and interact with them. This plugin allows people to leave their Twitter ID as a part of their comment. This takes some formatting on your part in the comments, but overall I think this is a great plugin to facilitate and improve the discussion.
Yet Another Related Posts Plugin: This plugin creates the “related posts” information at the bottom of each post. Not a must-have, but a fun and useful tool. It allows you to show readers what they might also be interested in. You can tune the logic to make the results more relevant, but even without tuning the plugin does a great job of determining related posts on its own.
That’s my list of plugins that I use. Tell me in the comments what your favorite WordPress plugins are.
If you need a sample of the code, please visit here.
One of the first things I look at when I check out code is how the code is organized. Is it laid out well? Is it coded to a particular standard?
In our code sample, the first thing we should address is how the code looks. There are a number of suggestions I would make immediately. Let’s dive in.
There are no DocBlocks or clear coding standards.
No clear coding standard jumps out at you right away when you read this code. There’s a lack of consistency, but beyond that, code completion is hindered by a lack of standards. Also, there are no DocBlocks, which would help improve the documentation of the code.
There are lots of coding standards out there: PEAR, WordPress, Drupal and Zend Framework all have a coding standard in place that you can adopt in your own code. I highly recommend adopting one.
The first thing I might do is add some DocBlocks. It will help us understand the code better.
Examining The Properties
The next thing I want to examine is the list of properties. Each property is private, with the exception of the public $res property. The $res property is actually unnecessary (it isn’t used by more than one method). From a style standpoint, most private properties are prefixed with an underscore ("_") to demonstrate that they are, in fact, private. Let’s go ahead and make this code change:
<?php
class Twitter
{
/**
* @var resource Curl connection
*/
private $_ch;
/**
* @var string Email address for administrator/tester
*/
private $_you;
/**
* @var string The username for the Twitter account.
*/
private $_user;
/**
* @var string The password for the Twitter account.
*/
private $_pass;
/**
* @var bool The flag for test mode
*/
private $_test;
/**
* @var string The URL for the API call
*/
private $_host;
/**
* @var bool Flag for whether or not this tweet has been sent today
*/
private $_done;
/**
* @var mixed Response of the cURL queries
*/
public $res;
We will, of course, need to make that change throughout the code.
But there’s something else about the properties that we need to examine. Because the properties are private, this class cannot be extended. This might be an intentional architectural choice; however, it is more likely that the author just didn’t consider the possibility of extension.
I prefer to make my properties protected. I also prefer to make ALL of my properties protected, and to access them through methods rather than directly from outside the object. Also, since the $res property is unnecessary, we’ll go ahead and drop it. Our properties list now looks like this:
class Twitter
{
/**
* @var resource Curl connection
*/
protected $ch;
/**
* @var string Email address for administrator/tester
*/
protected $you;
/**
* @var string The username for the Twitter account.
*/
protected $user;
/**
* @var string The password for the Twitter account.
*/
protected $pass;
/**
* @var bool The flag for test mode
*/
protected $test;
/**
* @var string The URL for the API call
*/
protected $host;
/**
* @var bool Flag for whether or not this tweet has been sent today
*/
protected $done;
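To show what accessing a property through methods might look like, here is a minimal sketch. The class name TwitterSketch and the methods setTest() and isTest() are my own stand-ins for illustration; they are not part of the reviewed code.

```php
<?php
// Hypothetical stand-in for the reviewed class; only the test flag is shown.
class TwitterSketch
{
    /**
     * @var bool The flag for test mode
     */
    protected $test = false;

    /**
     * Turn test mode on or off (assumed setter, not in the original code).
     */
    public function setTest($flag)
    {
        $this->test = (bool) $flag;
        return $this;
    }

    /**
     * Report whether test mode is active (assumed getter).
     */
    public function isTest()
    {
        return $this->test;
    }
}

$t = new TwitterSketch();
$t->setTest(true);
var_dump($t->isTest()); // bool(true)
```

Because the property is protected rather than private, a subclass could still read or change it directly, while outside code has to go through the methods.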
Conditionals
One thing I noticed about this code as well is that it has a number of conditionals that are not wrapped in curly braces. While this is syntactically correct, it creates some issues for debugging later, and I generally prefer not to rely on it.
Take this for example (from the code):
if ($this->already_tweeted($message)) $this->done = TRUE;
If we ever decide we need to debug by, say, outputting the message, and we change the code to this:
if ($this->already_tweeted($message))
var_dump($message);
$this->done = TRUE;
…then we always mark the tweet as having been sent previously, whether or not the condition was true. This is clearly not optimal.
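With curly braces, the same debugging change is harmless. The sketch below uses a hypothetical BraceDemo class with a stubbed alreadyTweeted() method that always returns false, so the effect is easy to see; it is not the reviewed class itself.

```php
<?php
// Stand-in for the article's class; alreadyTweeted() is a hypothetical stub
// that always reports "not sent yet" so the effect of the braces is visible.
class BraceDemo
{
    public $done = false;

    protected function alreadyTweeted($message)
    {
        return false;
    }

    public function check($message)
    {
        if ($this->alreadyTweeted($message)) {
            // Debug output can be added here without changing the logic.
            $this->done = true;
        }
        return $this->done;
    }
}

$demo = new BraceDemo();
var_dump($demo->check('Hello World')); // bool(false)
```

Any statement added inside the braces only runs when the condition is true, so the flag stays false here.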
Let’s go ahead and wrap all of the “i
Truncated by Planet PHP, read more at the original (another 6606 bytes)
On Wednesday, August 12th, we had a meeting of the DC PHP Developer’s Group. Keith Casey of Blue Parabola led a code review of a member-submitted sample. The review was informative, educational, and helpful. With the permission of that member, I’ve decided to write a series on the tools for reviewing code and refactoring it.
The code sample (included below) isn’t perfect. It needs work, and we’ll be working on it over the next few articles. Along the way, we’ll talk about strategies for identifying weaknesses, candidates for refactoring, and methods for writing better code before you get to the review stage. We’ll also refactor this into something more usable, and end up with a finished product that’s better than when we started.
The material is drawn from the DC PHP Developer’s Group meeting, as well as from notes that I put together. The refactoring process requires a lot of thought, and so this series will contain six more entries, starting with the first one on Monday. But before we begin, here’s the code sample we’ll be working with:
<?php // class_Twitter.php - NPC TWITTER AUTO-FEED
error_reporting(E_ALL);
// SEND AN AUTOMATED TWEET
// USAGE EXAMPLE
// require_once('class_Twitter.php');
// $status = 'Hello World';
// $t = new Twitter;
// $t->tweet($status);
// unset($t);
// DO NOT RUN THIS SCRIPT STANDALONE (HAS PASSWORD)
if (count(get_included_files()) < 2) {
header("HTTP/1.1 301 Moved Permanently"); header("Location: /"); exit;
}
class Twitter
{
private $ch;
private $you;
private $user;
private $pass;
private $test;
private $host;
private $done; // ALREADY TWEETED?
public $res;
public function __construct()
{
$this->user = "??? TWITTER USERNAME";
$this->pass = "??? TWITTER PASSWORD";
$this->you = "??? YOUR EMAIL ADDRESS";
$this->host = "http://twitter.com/";
$this->done = FALSE; // DEFAULT - NOT ALREADY TWEETED
$this->test = FALSE; // DEFAULT - THIS IS LIVE, NOT A TEST
$this->ch = curl_init();
curl_setopt($this->ch, CURLOPT_VERBOSE, 1);
curl_setopt($this->ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($this->ch, CURLOPT_USERPWD, "$this->user:$this->pass");
curl_setopt($this->ch, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_1_1);
curl_setopt($this->ch, CURLOPT_POST, 1);
}
public function __destruct()
{
curl_close($this->ch);
}
// SET AN INDICATOR THAT THIS IS NOT A LIVE TWEET
public function test()
{
$this->test = TRUE;
}
// DETERMINE IF THE MESSAGE HAS ALREADY BEEN SEND
private function already_tweeted($message)
{
$text = mysql_real_escape_string(trim($message));
$date = date('Y-m-d');
$code = md5($text . $date);
$sql = "SELECT id FROM twitterLog WHERE thash = \"$code\" ORDER BY id DESC LIMIT 1";
if (!$res = mysql_query($sql)) die(mysql_error());
$num = mysql_num_rows($res);
if ($num) return TRUE;
$sql = "INSERT INTO twitterLog (tdate, thash, tweet) VALUES ( \"$date\", \"$code\", \"$text \" )";
if (!$res = mysql_query($sql)) die(mysql_error());
return FALSE;
}
// POST A MESSAGE TO TWITTER
public function tweet($message)
{
if(strlen($message) < 1) return FALSE;
if ($this->already_tweeted($message)) $this->done = TRUE;
// IF ONLY A TEST, JUST EMAIL THE INFORMATION - DO NOT TWEET
if ($this->test)
{
$msg = '';
if ($this->done) $msg .= "ALREADY DONE ";
$msg .= "TWEET: $message";
mail($this->you, 'What We Would Have Tweeted', $msg);
return TRUE;
}
// DO NOT REPEAT YOURSELF
if ($this->done) return TRUE;
// STATUS UPDATE ON TWITTER
$this->host .= "statuses/update.xml?status=".urlencode( stripslashes( urldecode($message )));
curl_setopt($this->ch, CURLOPT_URL, $this->host);
$xxx = curl_exec($this->ch);
$this->res = curl_getinfo($this->ch);
if ($this->res['http_code'] == 0) return TRUE;
if ($this->res['http_code'] == 200) return TRUE;
return FALSE;
}
}
?>
At the request of the PHP community, Joe Stagner conducted additional tests of PHP, using both PHP 5.3 and APC. In his article he also revisits the questions that Andi Gutmans and I raised about the fairness of his tests.
The new tests reveal that APC doesn’t do much to improve the performance of his applications. This wasn’t a significant surprise to me; after reviewing his code I realized that the script wouldn’t be called again and thus APC wouldn’t provide a benefit. And though some of his tests (MySQL, file system tests, and Postgres) are dependent on external factors (like how well MySQL performs on Linux versus Windows), I think they paint a fair picture. Joe’s tests were fair.
The only thing left to evaluate is the engines that power the core of PHP versus ASP.NET.
One thing that no one brought up is that in his tests, the “for” loop performed more poorly than the “while” and, in some cases, “do-while”. Wait, what? Check out his test results and you’ll see that in each and every test, “for” is outperformed by “while”, and in many cases, even outperformed by “do-while”.
If you ask Sara Golemon, she’ll tell you that shouldn’t be. But it is. PHP’s nature as an open-source project means that contributions can be made by lots of different people, and sometimes you’re going to have performance suffer for it. But almost always you get a better product.
For those who are upset or incensed by Joe’s test results, I offer you this challenge: make PHP better. PHP is only as good as the people who contribute to it. Those who contribute to the core are hard-working volunteers. No one gets paid to write core PHP components or extensions. It’s all volunteerism. Those who work on it now are class acts who should be rewarded. And they should be aided, too.
We can’t examine the ASP.NET core to determine why it’s faster (and I wouldn’t be surprised if there were certain shortcuts in the way Windows and IIS handle requests), but we can examine the core of PHP and find ways to boost performance. And that’s the one thing about PHP that makes it different from ASP.NET: if you find a bug, you can patch it yourself. You don’t have to wait for the next release, and you don’t have to get some Microsoft engineer to care about it. You, yourself, can fix it.
The difference in PHP is you. Go make it better.
Doctors, lawyers and engineers are required by their professions to receive certifications and follow certain ethical guidelines. These rules exist to protect those who rely on their services. These professions often have access to sensitive information, or could wreck lives if they are remiss in their responsibilities. Business schools teach ethics, and despite the lapses in those ethics throughout the private sector, there still seems to be an emphasis placed on professional conduct in the business world.
Software development is a skill, and an increasingly important one, but one that is not governed by any licensing or ethical rules. It’s a Wild West of ethical and legal conduct. Sure, organizations like the Association for Computing Machinery have put together their list of ethical standards, but these are voluntary and not binding.
As programming becomes more deeply embedded in society and more important to it, we have an obligation to defend and protect the data of our customers and their customers to the best of our abilities. We have an obligation to develop to high standards, to promptly report and repair security bugs, and to warranty our work. I certainly don’t propose a system like the one engineers must follow, where they certify that they completed the work they are signing off on, and accept the consequences if that work is faulty; but I also think that the vast amount of personal, credit, financial, medical, and other data that programmers manage on a daily basis comes with a crucial level of responsibility.
There are real legal consequences, too. A quick search of Craigslist will reveal hundreds of “programmers” looking for work and offering “the best price.” But how many of them adhere to best practices? When programmers work, and deliver a product, they are promising that the product has been built to an industry standard. Making mistakes like insecure passwords, writing in security holes, or baking in bugs that undermine the system’s reliability and security only serves to expose the developer to litigation. This is not a good situation.
We need to take concrete steps toward incorporating ethics in our community. Computer scientists should be trained in ethics as a part of their training, if they’re not already. Ethics should be talked about at developer groups and conferences. People should write about (and debate) the ethical standards for the programming world. And ethical behavior should be a cornerstone of programming. Ethical programmers should be rewarded, and those who are unethical should be ostracized.
It’s no secret that we like statistics. Human beings like being able to compare two things, apples to apples. We like the ability to say “this thing is better because…” and being able to back it up with a seemingly solid fact, rather than just our opinion. We build charts to show how things stack up side-by-side, and we even build software to make those charts, which are then compared using the charts, and on we go.
But benchmarks, for all their decision-making aid, fail under the best of circumstances for one simple reason: they’re not real life.
Never is this more true than in Joe Stagner’s blog post on whether Windows or Linux, and PHP or ASP, was faster. It should really come as no surprise that a Microsoft employee would write an article that would conclude that Windows was faster, and that ASP was faster (I’m pretty sure if he’d found the other way and published it that Ballmer would have canned him).
Benchmarks come loaded with all sorts of problems. It doesn’t matter if it’s Microsoft doing them or Apple doing them; they don’t mimic real-world conditions, and any number of factors affect how the benchmarks are rendered (e.g. other processes running, configuration, installed modules, CLI or web, how the interpreter reads the code, etc).
Some of the problems I found with his benchmarks: there’s no discussion of configuration. Did he use the defaults? What extensions did he install alongside the core installation? The code he used isn’t published. Update: It appears that he did include his code, though I don’t know when the links were added. They were not in the original draft of the article. How was the server configured? How does the engine render each loop? ASP.NET is closed-source; that’s to say, there’s no way for us to know if the interpreter sees an empty loop or function and decides to bail rather than actually execute it. What version of PHP was he using? Why isn’t there an Ubuntu test for PHP 5.3 (I actually know the answer to that question)?
I don’t think that Mr. Stagner is trying to insult anyone or assert that ASP is 2900% faster than PHP at every task. I do think that the benchmarks provided show that ASP is faster at the given tasks and that’s it.
So, if you need to iterate over 20 million empty for loops, ASP.NET is your language.
Update: After a conversation with Cal Evans and a read about ASP.NET it appears that ASP.NET comes with a built-in bytecode cache that cannot be disabled. However, Mr. Stagner conducted his tests WITHOUT APC installed on PHP. While APC is not part of the PHP core (and bytecode caching appears to be part of the ASP.NET core), PHP without APC is a poor configuration choice.
Imagine that you’ve just been offered a brand new job. That’s fantastic! Now, make sure you do the math and find out what it will cost you.
“Cost me?” you ask. Yes. Accepting a new job is exciting and often beneficial, but there are costs associated with it that you must consider before you sign the offer and tell off your present employer. Some of these costs are obvious, but others are less obvious. These costs are rarely fully explored, because we have a tendency to see the grass as greener someplace else, and if the offer we receive is substantially better than our current one, we are more inclined to accept it. But not counting the costs would be a huge mistake.
Here are some of the costs you should consider when deciding whether you should accept a new position:
Can You Afford No Salary For A While?
Finding a new job before you leave the old one is the preferred strategy for changing positions, but doesn’t necessarily mean your paycheck will be uninterrupted. Sometimes you get lucky, but most of the time there’s some lag time. That lag time might mean going without a paycheck for up to two months.
Make sure that you can afford to do this without adding debt or defaulting on existing obligations. If you’ve built an emergency fund, this may be the sort of thing it calls for; you might consider also negotiating a signing bonus with your new company that will cover some of the lost income. Consider timing as well: you may be able to work with your new employer to time your departure close to the beginning of their new pay period and the end of your current company’s pay period to help ease the transition.
I’ve had this discussion with my friends before: some of them think it’s rude to ask a potential new employer about their pay schedule. I disagree; I think it’s a matter of financial security and prudence. As you’re considering whether to accept a position, understanding how it will affect your budget is a key factor. You should ask, but only after they’ve issued you a signed, formal offer.
Consider Required Repayments To The Previous Employer
Many employers have great benefits, like tuition reimbursement and continued training. Some employers even allow you to borrow against your vacation time before you’ve accrued it. Many companies offer a Section 125 plan, which allows you to pay for medical benefits over the course of the year (but makes the money available upfront). I encourage use of these benefits, but many of them come with attachments or agreements to repay them if you depart the company within a certain amount of time. Be very careful to understand these agreements and consider any that you may be under before you resign.
It can be expensive to get a large bill from your employer for benefits that you used but now have to repay, and this may affect your decision to leave. Of course, you can’t very well ask human resources for this information without drawing suspicion, so it is vitally important that you keep track of every piece of paperwork you sign for a company before you need it.
Consider The Effects On Retirement Savings
When you leave a company chances are good that you’ll either need to roll over your 401(k) account or continue paying for it out of your own pocket. The fees that the company paid on your behalf will stop, and you’ll become responsible for them. Depending on the amount in your account you may not be allowed to keep it where it is; check with your provider to see.
There is another hidden danger in leaving a company: the matching funds the company puts in along with your contribution may not vest for a certain period of time. So, if the company matches you dollar-for-dollar and you’ve accrued $10,000 in your 401(k), that may drop to $5,000 if the matching half of those funds is not yet vested.
Be sure to also consider the tax implications, especially if you have them distribute any portion of your 401(k) money. Besides paying standard taxes, if you’re below the retirement age, you’ll pay a hefty penalty (typically around 10%) to the IRS. You do have options, though: you may be able to roll over your 401(k) to the new company, or roll it into a low-cost IRA without tax liabilities. Check with your accountant or tax professional.
Be Sure You Understand Insurance Implications
Something I always want to account for is insurance – health, life, dental, vision. You’ll want to consider a number of things when switching employers.
First, you want to know when their coverage kicks in, and when your existing coverage will expire. Obviously you want to avoid as much of a gap as possible, and most emplo
Truncated by Planet PHP, read more at the original (another 4467 bytes)
When the next version of Ubuntu is released on October 29th, PHP developers won’t be able to upgrade to PHP 5.3 through the included package management tools.
A meeting of the development team on July 30th nixed the inclusion of PHP 5.3 in Karmic, the next iteration of Ubuntu for the desktop and the server. According to meeting minutes, there is concern amongst the Ubuntu security team that failure to include the Suhosin patch in the PHP release would be a feature regression. Instead, the release will be referred to a PPA until more testing can be completed.
It is unlikely that Ubuntu will see official PHP 5.3 support until April, 2010, based on the history of Ubuntu release cycles. However, you can still compile PHP 5.3 on Ubuntu yourself.
The Alternative PHP Cache (APC) is a tool that offers a massive performance gain to almost any PHP application simply by turning it on. This extension to PHP provides both opcode caching and user caching, placing files and data into memory for fast retrieval, and, if used correctly, eliminating some of the bottlenecks of the file server or database.
Turning APC on is a great way to get a performance boost, but there are ways to help improve APC’s performance.
The very first thing that APC does when it’s turned on is begin caching the opcode output from all of your PHP files. Essentially, it takes the compiler’s output, stores it in memory, and then retrieves it later on if that file is needed. Also by default, APC checks the file on the file system to determine if it has changed since the last time APC cached the opcodes.
When APC does this, it executes a file-system-level stat() call, which in the scheme of the calls performed is expensive. This call basically helps APC determine if it should execute the opcodes that you have stored in memory, or if it should read and compile the file. This behavior is on by default but can be disabled.
To disable it, you must add the following line to your php.ini file:
apc.stat=FALSE
This will not prevent APC from caching the file if the file is not yet in memory; it will simply keep APC from checking for an updated file each and every time it executes it.
There are some obvious benefits to this. First, you get a performance boost. This method is used by Facebook to improve the performance of its servers. Clearly your production systems are not updated on an exceptionally regular basis, meaning that you don’t have to worry about the files becoming stale in memory.
As with all things, there are also some drawbacks. First, a restart of the webserver or a call to apc_clear_cache() is required before the APC cache will be cleared and any changes will be reflected. This means that you may want to run APC with apc.write_lock in your INI file to keep your servers from crashing. Also, this will not improve poorly written or poorly executed code; bad code will still run slowly, no matter how many boosts you give it. This will only help well-written code run faster. APC is also only effective if the memory is not completely full, so running APC on heavily-visited sites with relatively little memory won’t earn you much of a boost at all. Finally, if you implement this on a low-traffic website you’re unlikely to see any improvement at all; the higher the traffic, the more this will benefit the web server.
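As a rough sketch of the cache-clearing step, a deploy hook might look something like the following. The function name clearOpcodeCacheAfterDeploy() is hypothetical, and I am assuming the script is requested through the web server, since a command-line run would most likely only touch the CLI process’s own cache.

```php
<?php
// Hypothetical deploy hook: call this from a script served by the web server
// after new code is in place. The APC extension may or may not be loaded.
function clearOpcodeCacheAfterDeploy()
{
    if (!function_exists('apc_clear_cache')) {
        // APC is not loaded in this process; nothing to clear.
        return false;
    }

    // With no arguments, apc_clear_cache() clears the system (opcode) cache.
    apc_clear_cache();
    return true;
}

$cleared = clearOpcodeCacheAfterDeploy();
echo $cleared ? "APC cache cleared\n" : "APC not available\n";
```

Guarding with function_exists() keeps the script from failing on machines where APC is not installed, such as a development box.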
As with all things that promise performance improvements, you are encouraged to profile your code and benchmark the improvement before deploying to a production environment.
Warning: Do not run this in a development environment; otherwise you will be unable to see your changes reflected! This is a production feature only.
How many of us have seen this example in code we’ve worked on?
<?php
require '/path/to/lib/DatabaseI.php';
require '/path/to/lib/Database.php';
require '/path/to/lib/Authenticate.php';
require '/path/to/lib/User.php';
require '/path/to/lib/BlogEntry.php';
require '/path/to/lib/Comments.php';
require '/path/to/lib/TemplateLoader.php';
?>
…and well you get the point…
If you’ve been in the business for any length of time you’ll recognize this, because almost every single PHP developer does it at one point or another.
Until they learn about SPL’s autoload functions, that is.
There’s a nifty function built into PHP called spl_autoload_register(). It allows us to register a callback that PHP invokes whenever we use a class that hasn’t been loaded yet, so that we can load it automatically.
For example, we can do this:
<?php
set_include_path(realpath('/path/to/lib') . PATH_SEPARATOR . get_include_path());
function autoload($className)
{
require($className . '.php');
}
spl_autoload_register('autoload');
$Database = new Database();
?>
That’s it! That’s all there is to defining your own autoloader. This autoloader will then try to automatically get the files needed to instantiate the classes you’re calling.
This has the benefit of not loading excess code that you may never need, which will improve your performance and the time it takes to execute your scripts. In fact, some people have found major performance improvements by using the autoloader.
Now someone is probably thinking “wait, my classes are organized into libraries and placed into their directories. I don’t have to define all of those directories, do I?”
The short answer is no. The long answer is that the autoloader allows us to define any rules we like, so we can do any number of things. Some of my favorite solutions:
* Build a list when the application runs the first time, and allow the autoloader to rebuild its list if something isn’t found, and then fail gracefully.
* Build a list on the fly, and store that in memory (memcached, APC, etc.)
* Have a hard-coded list, and throw an exception if a class outside that list is called.
* Apply the PEAR naming conventions, and use a regular expression or explode() to derive the file path.
Because the spl_autoload_register() function lets you write any logic you like, you can make your rules as stringent or as loose as you like.
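As one illustration of the last option in the list, here is a minimal sketch of an autoloader that turns underscores in a class name into directory separators. The function names classNameToPath() and pearStyleAutoload() are made up for this example, and it assumes your library directory is already on the include path.

```php
<?php
// Hypothetical helper: maps an underscore-separated class name to a relative
// file path, e.g. Zend_Db_Table becomes Zend/Db/Table.php.
function classNameToPath($className)
{
    $parts = explode('_', $className);
    return implode(DIRECTORY_SEPARATOR, $parts) . '.php';
}

// Hypothetical autoloader: only requires the file if it can be found on the
// include path, so an unknown class fails gracefully instead of fatally.
function pearStyleAutoload($className)
{
    $resolved = stream_resolve_include_path(classNameToPath($className));
    if ($resolved !== false) {
        require $resolved;
    }
}

spl_autoload_register('pearStyleAutoload');

echo classNameToPath('Zend_Db_Table'), "\n";
```

Because the callback simply returns when no file is found, other registered autoloaders still get their turn before PHP reports a missing class.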
So spl_autoload_register() in, long sets of include and require statements out!
I gave a talk at OSCON 2009 called “XDebug Your Code: Tips and Tricks for Writing Bug-Free High Impact Code.” The talk included slides, which are posted here for people to download and use.
Enjoy!
One of the biggest challenges in object-oriented programming with PHP is passing objects around and letting other objects use them. This challenge can be solved with careful design, however. Here we will discuss the registry pattern, which is not one of the GoF’s original patterns but is an important pattern nonetheless.
Take the following code example:
<?php
class DatabaseObj
{
protected $_resource;
public function __construct($user, $pass, $db)
{
// Connect to the database.
$this->_resource = $returnedResource;
}
public function getData($string)
{
// Query the database
return $resultResource;
}
public function clean($var)
{
// Do cleaning here.
return $varClean;
}
}
class Authenticate
{
protected $dbObj;
public function checkCredentials($user, $pass)
{
$userClean = $this->dbObj->clean($user);
$passClean = $this->dbObj->clean($pass);
return $this->dbObj->getData('SELECT * FROM users WHERE user = "' . $userClean . '" AND pass = MD5("' . $passClean . '")');
}
}
This is fairly straightforward, right? Well, no. We want to use our defined database object. However, we have to supply that object to Authenticate in order to use it (and avoid getting an error). There are a few ways to do this. We could do this:
<?php
class Authenticate
{
protected $dbObj;
public function __construct($user, $pass, $db)
{
$this->dbObj = new DatabaseObj($user, $pass, $db);
}
public function checkCredentials($user, $pass)
{
$userClean = $this->dbObj->clean($user);
$passClean = $this->dbObj->clean($pass);
return $this->dbObj->getData('SELECT * FROM users WHERE user = "' . $userClean . '" AND pass = MD5("' . $passClean . '")');
}
}
?>
This would certainly give us a database object to use. However, if we have six or seven objects in our script, creating a new database object each time means that you’d have six or seven MySQL connections by the end of the script, which is not optimal! Also, if you ever change the signature of the constructor in the database object, you have to change it everywhere else that instantiates the object, which could be every single object. This is a problem.
We could also turn the database class into a singleton, and make our code look like this:
<?php
class DatabaseObj
{
protected static $thisObj = false;
protected $_resource;
protected $user = 'me';
protected $pass = 'P@ssw0rd!';
protected $db = 'mydb';
protected function __construct($user = null, $pass = null, $db = null)
{
// Connect to the database.
$this->_resource = $returnedResource;
}
public function getData($string)
{
// Query the database
return $resultResource;
}
public function clean($var)
{
// Do cleaning here.
return $varClean;
}
public static function getConnection($user = null, $pass = null, $db = null)
{
if((self::$thisObj instanceof DatabaseObj))
return self::$thisObj;
self::$thisObj = new DatabaseObj($user, $pass, $db);
return self::$thisObj;
}
}
class Authenticate
{
protected $dbObj;
public function __construct()
{
$this->dbObj = DatabaseObj::getConnection();
}
public function checkCredentials($user, $pass)
{
$userClean = $this->dbObj->clean($user);
$passClean = $this->dbObj->clean($pass);
return $this->dbObj->getData('SELECT * FROM users WHERE user = "' . $userClean . '" AND pass = MD5("' . $passClean . '")');
}
}
?>
Using this design pattern will ensure that you always get the same database object no matter what. This is exactly the behavior we want: it ensures that we never establish more than one connection and it takes care of the signature problem (albeit not completely). Let’s take a moment to discuss how and why this works.
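To see the “same object every time” behavior in isolation, here is a stripped-down sketch. DemoConnection is a hypothetical stand-in for the database class with the actual connection logic left out.

```php
<?php
// Hypothetical stand-in for the database class; no real connection is made.
class DemoConnection
{
    protected static $instance = null;

    // A protected constructor prevents "new DemoConnection()" from outside.
    protected function __construct()
    {
    }

    public static function getConnection()
    {
        if (!(self::$instance instanceof DemoConnection)) {
            self::$instance = new DemoConnection();
        }
        return self::$instance;
    }
}

$first  = DemoConnection::getConnection();
$second = DemoConnection::getConnection();

// Identity comparison: both variables refer to the very same object.
var_dump($first === $second); // bool(true)
```

The static property holds the only instance, so every caller that goes through getConnection() shares it.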
In PHP 4, objects were essentially arrays with functions associated with them. That changed in PHP 5, when objects became more like real objects, being stored in memory and returning a “pointer” to the lookup table (for more on how objects work in PHP 5 you should read Sara Golemon’s blog post
Truncated by Planet PHP, read more at the original (another 5382 bytes)
One of the best features of PHP’s object model (and really all object models) is the concept of inheritance – that is, derived classes inherit the members and methods of their parents. This is a fantastic way to further encapsulate and abstract your code because it means you can define some base functionality and then later on extend that class to add new functionality and even override existing functionality to make the class specific.
But this concept is a double-edged sword in PHP (and in other object-oriented languages). Here’s where deep inheritance chains can kill you.
Imagine that your source code has the following…
<?php
abstract class BaseClass
{
// Some code here.
}
abstract class AddSomeFunctionality extends BaseClass
{
// More basic functions.
}
class DoSomeWork extends AddSomeFunctionality
{
// Actual work done.
}
class DoSomeMoreWork extends DoSomeWork
{
// More work done; override some of the methods in prior classes.
}
On its face this might not look too terribly complex. It’s not until you actually get into the code that you start to have some problems. Imagine the following function:
<?php
function Execute(DoSomeWork $object)
{
$object->runMyFunc();
}
$object = new DoSomeMoreWork();
Execute($object);
What happens here? Does this work? Absolutely; every descendant of DoSomeWork passes an “instanceof” check against it, and therefore passes the type hint. But imagine that runMyFunc() had been defined in AddSomeFunctionality and then overridden in DoSomeMoreWork. Execute() will run the overriding version, so you may get behavior that the function’s author never anticipated, even though the object passes the typehinting check.
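Here is a runnable sketch of that scenario. The bodies of runMyFunc() are invented for illustration (they just return the name of the class that defined them), and Execute() returns the result so the difference is visible.

```php
<?php
abstract class BaseClass
{
}

abstract class AddSomeFunctionality extends BaseClass
{
    // Assumed implementation: reports which class defined the method.
    public function runMyFunc()
    {
        return 'AddSomeFunctionality';
    }
}

class DoSomeWork extends AddSomeFunctionality
{
}

class DoSomeMoreWork extends DoSomeWork
{
    // Overrides the inherited method two levels up the chain.
    public function runMyFunc()
    {
        return 'DoSomeMoreWork';
    }
}

function Execute(DoSomeWork $object)
{
    return $object->runMyFunc();
}

echo Execute(new DoSomeWork()), "\n";     // AddSomeFunctionality
echo Execute(new DoSomeMoreWork()), "\n"; // DoSomeMoreWork
```

Both calls satisfy the type hint, but the second one runs code that Execute() was never written against.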
This is certainly a problem that can be solved through wise coding and understanding how inheritance works. It’s true that you could add a simple check using instanceof; however, if you someday extend DoSomeWork with a new class (say, one called DoMyWork), then the check will fail you just as the type hint does above.
The better solution is to be cautious with inheritance. Developers owe it to themselves and to their peers to master the concepts of “is-a” versus “has-a”. Objects that are of the same family should extend one another; objects that are not part of the same family should be given to one another but should not extend one another.
An example:
<?php
abstract class Fruit {}
abstract class TreeFruit extends Fruit {}
abstract class SummerFruit extends TreeFruit {}
class Cherry extends SummerFruit {}
/* Now for a class that shouldn't be in the family */
class FruitSalad extends Cherry {}
In this example, we have fruit. We extend that into tree fruits (cherries, peaches, pears, etc.) and furthermore we extend it into summer fruits. From summer fruits we could go any number of directions; I chose cherries (because I love cherries). But we could also have a class of peaches, plums, etc. that would classify as summer tree fruits. But as soon as we extend cherry to fruit salad, we create a problem. This is where inheritance can kill you. A fruit salad is not a fruit (our base class); a fruit salad is composed of fruits. Let’s take a look at a better object model.
<?php
/** Assume that all classes not redefined here exist from our last example **/
final class Cherry extends SummerFruit {}
class FruitSalad
{
    protected $_fruits = array();

    public function addFruit(Fruit $fruit)
    {
        // $_fruits is initialized above, so no is_array() check is needed.
        return $this->_fruits[] = $fruit;
    }
}
This is a much better object model; instead of trying to extend the Cherry class we can have a full fruit salad (composed of any other fruits we might define). FruitSalad may not inherit the methods of Cherry, but it will be able to access the Cherry API (the public methods).
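As a rough, self-contained sketch of how this composition might be used: the getName() method, the Peach class, and the describe() helper below are assumptions added for illustration and are not part of the original example.

```php
<?php
abstract class Fruit
{
    // Hypothetical public API, added only so the salad has something to call.
    abstract public function getName();
}
abstract class TreeFruit extends Fruit {}
abstract class SummerFruit extends TreeFruit {}

final class Cherry extends SummerFruit
{
    public function getName() { return 'cherry'; }
}

final class Peach extends SummerFruit
{
    public function getName() { return 'peach'; }
}

class FruitSalad
{
    protected $_fruits = array();

    public function addFruit(Fruit $fruit)
    {
        $this->_fruits[] = $fruit;
        return $this; // returning $this allows chained calls
    }

    // Uses only the public API of each fruit; no inheritance required.
    public function describe()
    {
        $names = array();
        foreach ($this->_fruits as $fruit) {
            $names[] = $fruit->getName();
        }
        return implode(', ', $names);
    }
}

$salad = new FruitSalad();
$salad->addFruit(new Cherry())->addFruit(new Peach());
echo $salad->describe(), "\n";     // cherry, peach
var_dump($salad instanceof Fruit); // bool(false)
```

The salad holds fruits without being one, so a function type-hinted for Fruit will correctly refuse a FruitSalad.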
Inheritance is a glorious and useful tool, but used improperly it creates terrible problems. Good design, smart architecture, and understanding what is a member of the family and what is not will help ensure that the programmers who come after you will be more easily able to understand your code.
The explosion of the concept of “web services” has generated a debate over what “web services” actually are. An article by Raj Mishra tries to limit the concept of “web service” to a strict ten-point list, insisting that a web service have a WSDL and use SOAP. While this is a perfectly fine sentiment (even the one endorsed by W3C), it certainly is a limiting description.
The reality is that the definition of “web service” has grown, and Mr. Mishra’s list is both inaccurate and misleading.
While the W3C description defines a web service as using SOAP and having a WSDL, it also states that “There are many things that might be called “Web services” in the world at large. … without prejudice toward other definitions, we will use the following definition.” In other words, W3C’s definition is neither comprehensive nor exclusive.
Some of the world’s most popular web services no longer force use of WSDL or SOAP. Flickr, Twitter, Facebook and even Google allow access to their data and services without using SOAP, by way of XML, REST, JSON-RPC or XML-RPC. And when the big players decide that their definition of “web service” includes these things, that really overrides any definition issued by anyone else.
So perhaps it’s time now for a new definition of “web service.” Here’s mine.
A web service is any service that meets the following criteria:
How does a service like say, Twitter, follow these? First, it provides an endpoint for both the addition and retrieval of information (in XML and JSON). Second, it contains more than one documented method for doing so. Third, almost the whole point of Twitter is to access it from somewhere OTHER than the web – SMS, desktop applications, etc. Thus, Twitter would qualify as a web service, whether you use their WSDL or their REST API.
The reality is that as much as we might want to constrain concepts to a particular corner, the major players and the world at large will determine what qualifies as a “web service” and what does not. SOAP is one of many different ways to consume and access web services but it is by no means the sole way to do so. Limiting the concept of “web services” to a definition issued by W3C, who themselves acknowledge that there may be other “web services” that don’t fit their definition, just makes you look silly.
Last December, I wrote about the use of PHP superglobals inside of classes (link here). I asserted at the time that superglobals inside of a class violated some basic rules on what a class was supposed to do. Today, I am revisiting that discussion.
The placement of superglobals inside a class creates an impossible situation for code reuse. Take for example my original sample:
<?php
class VerifyLogin extends UserObject
{
    function verifyCredentials()
    {
        $username = mysql_real_escape_string($_POST['username']);
        $password = mysql_real_escape_string($_POST['password']);
        $passwordhash = MD5($username . $password); // Salt our PW Hash
        // String values must be quoted inside the SQL statement.
        $sql = "SELECT id FROM userTable WHERE username = '" . $username . "' AND password = '" . $passwordhash . "' LIMIT 1";
        $resource = mysql_query($sql);
        if(mysql_num_rows($resource) > 0)
        {
            $array = mysql_fetch_row($resource);
            $this->userID = $array[0];
            $this->loggedIn = true;
            return true;
        }
        else
            return false;
    }
}
?>
This code will work just fine for our purposes – on this site. But what happens when we want to move this to another site? Unless we leave our form fields named “username” and “password” we’ll have to modify the original code. This makes reusing the code more difficult. Since user validation is particularly frequent, there’s no reason why we should be rewriting this code each and every time we create a new application.
There is a way to refactor this code to make it easier to reuse later on (note that my refactoring addresses only the superglobals, and no other facet of the code):
<?php
class VerifyLogin extends UserObject
{
    function verifyCredentials($username, $password)
    {
        // Escape the arguments here, since they no longer come pre-escaped.
        $username = mysql_real_escape_string($username);
        $password = mysql_real_escape_string($password);
        $passwordhash = MD5($username . $password); // Salt our PW Hash
        $sql = "SELECT id FROM userTable WHERE username = '" . $username . "' AND password = '" . $passwordhash . "' LIMIT 1";
        $resource = mysql_query($sql);
        if(mysql_num_rows($resource) > 0)
        {
            $array = mysql_fetch_row($resource);
            $this->userID = $array[0];
            $this->loggedIn = true;
            return true;
        }
        else
            return false;
    }
}
// The superglobals are read here, outside the class, and passed to the method.
$login = new VerifyLogin();
$validUser = $login->verifyCredentials($_POST['username'], $_POST['password']);
if($validUser)
{
    // Do something here
}
else
{
    // Do something else
}
?>
This refactored code means that we can use our class anywhere we like, for anything we like, without having to really care what our $_POST array looks like. We simply have to supply the data to the method, and it will do the work for us.
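The same idea can be shown without a database at all. The sketch below is not the VerifyLogin class from above: CredentialVerifier, the in-memory user array, and the form field names are all hypothetical, and it swaps in PHP's password_hash()/password_verify() (available from PHP 5.5) in place of the MD5 approach so that it runs on its own.

```php
<?php
// Hypothetical class for illustration; not the VerifyLogin class from the post.
class CredentialVerifier
{
    private $hashes;

    // $hashes maps username => hash produced by password_hash().
    public function __construct(array $hashes)
    {
        $this->hashes = $hashes;
    }

    // Takes plain values; knows nothing about $_POST or form field names.
    public function verifyCredentials($username, $password)
    {
        if (!isset($this->hashes[$username])) {
            return false;
        }
        return password_verify($password, $this->hashes[$username]);
    }
}

$verifier = new CredentialVerifier(array(
    'alice' => password_hash('s3cret', PASSWORD_DEFAULT),
));

// Stand-ins for two sites whose forms name their fields differently.
$siteA = array('username' => 'alice', 'password' => 's3cret');
$siteB = array('login' => 'alice', 'pass' => 'wrong');

var_dump($verifier->verifyCredentials($siteA['username'], $siteA['password'])); // bool(true)
var_dump($verifier->verifyCredentials($siteB['login'], $siteB['pass']));        // bool(false)
```

Because the method only sees its two arguments, the same class serves both forms unchanged; only the calling code differs per site.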
Object-oriented programming in PHP allows us to do amazing things like build frameworks and libraries that can be used over and over again. We shouldn’t muck that up by including short-term or application-specific functions and variables when they’re unnecessary.
Warning: The above code has security holes and should never be used without refactoring. It is intended for demonstration purposes only.
We live in a technological world, one that allows us to communicate instantly. Through email, instant message, IRC, Twitter, Facebook, and tons of other platforms, a message we have to share can be spread to anyone in the world almost instantly. So it would seem odd, then – in fact, almost paradoxical – that as a world we’re less interconnected than ever. But we are.
How is this even possible?
As the world has become faster, the number of communiques we receive each day has increased. And yet the value perceived from those communiques has declined. In addition, the methods we used to use (sending notes, cards, thank yous, and wedding invitations) have become less common, as we’ve tried to replace them with electronic means, without realizing that the value of these things is in their personality.
No electronic note can replace the feeling of good paper and a hand-written message. And yet we’ve tried, desperately, to reduce the burdens of buying paper, stamps and ink. When a true note is exchanged between friends or lovers, there’s a magic to it that is lost in the electronic form. A card from a long absent friend, or a letter from a parent, all these things have an electronic counterpart, but no electronic equivalent.
My girlfriend and I regularly exchange love notes. She gave me stationery as a gift, a long-standing holdover of another generation but relevant for this one, too. I’m certain the postman has back problems, for all the mail she receives, of personalized cards, postcards, and letters. But she’s actually taken the time to cultivate relationships with friends, and she’s joyful every time she gets a letter. A technophobe she is not; she even has a Twitter account. She just knows the value of getting something in the mail.
This long-lost art doesn’t have to stay lost. We can all do better. A hand-written thank you note or a postcard from a trip is all it takes. The truth is that once you take the time to write, others will reciprocate. And they will appreciate you, too.
With the release of PHP 5.3 to the world, I wanted to be one of the first to try it. The problem is that the typical package managers for Ubuntu won’t include PHP 5.3 for some time – perhaps as long as a year. This is a problem, since I really want to try PHP’s latest and greatest features for myself.
The problem is, there seems to be a lack of clear, coherent instructions online about compiling PHP on Ubuntu from source. Either it’s so insanely simple that anyone who does it figures everyone else knows how, or everyone relies so heavily on pre-built binaries that any time I search for “Ubuntu PHP source” I get “why don’t you just use the built-in package manager?” And so I wanted to write a set of instructions for how I configured and compiled PHP on Ubuntu Jaunty.
For the most part we’ll use standard out-of-the-box packages that Ubuntu provides. Since the features I’m looking for are in PHP 5.3, I’m fine with having slightly outdated packages for other software as long as they’re compatible with PHP 5.3.
1. Install Apache.
The first step is to install Apache. You can feel free to use your own web server, but for this example I opted for Apache based on the simplicity of installing PHP as an Apache module. This command should also install the development tools, which PHP will need to properly configure itself for Apache.
aptitude install apache2 apache2-mpm-prefork apache2-prefork-dev apache2-utils apache2.2-common
2. Install MySQL and PostgreSQL
Most people use MySQL, but almost all PHP installations seem to have PostgreSQL installed as well. I think it’s always good to keep your options open, and you might take a look at Postgres at some point. If you opt not to install Postgres, that’s fine; just alter your configure statement to exclude it.
aptitude install postgresql-8.3 postgresql-client-8.3 postgresql-client-common postgresql-common postgresql-server-dev-8.3
aptitude install mysql-client mysql-client-5.0 mysql-common mysql-server mysql-server-5.0 mysql-server-core-5.0
3. Install required libraries
PHP itself is very easy to configure and install. However, if you want to get the most out of it you’ll need to install the dependency libraries it uses for various functions like mcrypt and gd. This can be the most challenging part of installing PHP; many hours can be spent issuing a configure command only to have it fail, have to install another library, and repeat the process.
These libraries are the libraries that were needed to install the options in the configure statement below. You may need more or fewer libraries depending on your individual needs and desires.
aptitude install libtidy-dev curl libcurl4-openssl-dev libcurl3 libcurl3-gnutls zlib1g zlib1g-dev libxslt1-dev libzip-dev libzip1 libxml2 libsnmp-base libsnmp15 libxml2-dev libsnmp-dev libjpeg62 libjpeg62-dev libpng12-0 libpng12-dev zlib1g zlib1g-dev libfreetype6 libfreetype6-dev libbz2-dev libxpm4-dev libmcrypt-dev libmcrypt4
4. Download the PHP Source Code
Visit http://www.php.net and download the latest version of PHP 5.3. The following commands may be helpful:
cd ~
5. Configure.
Change your working directory to the unpacked PHP source code. Execute the following command.
./configure --with-apxs2=/usr/bin/apxs2 --with-mysql=/usr --with-mysqli=/usr/bin/mysql_config --with-pgsql=/usr --with-tidy=/usr --with-curl=/usr/bin --with-curlwrappers --with-openssl-dir=/usr --with-zlib-dir=/usr --enable-mbstring --with-xpm-dir=/usr --with-pdo-pgsql=/usr --with-pdo-mysql=/usr --with-xsl=/usr --with-ldap --with-xmlrpc --with-iconv-dir=/usr --with-snmp=/usr --enable-exif --enable-calendar --with-bz2=/usr --with-mcrypt=/usr --with-gd --with-jpeg-dir=/usr --with-png-dir=/usr --with-zlib-dir=/usr --with-freetype-dir=/usr --enable-mbstring --ena
Truncated by Planet PHP, read more at the original (another 4145 bytes)
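Once a build configured along these lines has been compiled and installed, a short script can confirm which extensions actually made it in. This is only a sketch: the missingExtensions() helper is hypothetical, and the extension list is an assumption chosen to mirror some of the configure flags above, so adjust it to your own options.

```php
<?php
// Returns the names from $wanted that this PHP build does not have loaded.
function missingExtensions(array $wanted)
{
    $missing = array();
    foreach ($wanted as $name) {
        if (!extension_loaded($name)) {
            $missing[] = $name;
        }
    }
    return $missing;
}

// Assumed list, mirroring some of the --with/--enable flags.
$wanted = array('mysqli', 'pgsql', 'tidy', 'curl', 'mbstring', 'xsl', 'gd', 'bz2');

echo 'PHP version: ', PHP_VERSION, "\n";
$missing = missingExtensions($wanted);
echo $missing
    ? 'Missing: ' . implode(', ', $missing) . "\n"
    : "All requested extensions are loaded.\n";
```

Keep in mind that the command-line binary and the Apache module can load different configuration, so it may be worth running the check through the web server as well.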
Shortly after my lunch I saw the following tweet from David Pogue, technology columnist for The New York Times.

Given Pogue’s large following, I was disappointed by the advice he gave.
A secure password on a laptop isn’t to keep semi-trusted people off of it. It’s to keep it protected in the event that the hard drive is lost. Arguably, the drive could be removed and read without booting it, removing all password protection, but a good password combined with disk encryption can help protect data from theft.
Pogue argues that he doesn’t need a password for security, but any information that is stored on file servers or worse, in his keychain, is accessible with that simple password he is using. Failing to have a secure password not only places his data at risk, but the data of those he might know or work with.
Even if gaining physical access (e.g. finding the computer or stealing it) would grant a person a huge advantage, this is not an excuse to make the password so incredibly simple. And for one of the most well-known technology columnists to suggest otherwise to his 444,666 followers is negligent, bordering on criminal.
Today is my last day at The Bivings Group. Yesterday was my last effective day; as I write this, I’m flying over West Virginia on my way to a friend’s wedding in California.
Starting June 22nd, I’ll be working for Applied Security. I’ve learned a lot at The Bivings Group, and worked on a lot of great projects, including The Pickens Plan. I’ll miss the friends I made, but I’m glad that I’ll be able to see them again at the developer’s group meetings and conferences.
Good luck to the folks at The Bivings Group. See you soon.
The PHP Community is a fairly large, rules-free community of people who share a common interest in programming. Many of us hang out on Twitter, our own blogs, or on IRC (usually on Freenode #phpc). So some events of the day certainly caught me by surprise.
This afternoon, while hanging out in a lesser known channel, I was kicked out for no reason besides the whim of the operator, Derick Rethans. No warning, no rude comment on my part, just a joke followed by a kick. Alison Lunde was also kicked for a seemingly bogus reason. Another person was banned.
How could this be, two weeks after the “community conference,” which Derick himself attended?
I was furious. I felt betrayed, violated, attacked personally for something that shouldn’t have happened. I hadn’t been kicked out of an IRC channel since before I was a teenager; this was a personal insult and by God, I was ready to fight. I talked to a number of my friends in the PHP community and told them that this slight was beyond insulting; I was going to tear Derick a new one on my blog and wash my hands of the community forever. After all, everyone in the channel seemed to support this decision, right? They were all still there, hanging out, perhaps hoping they wouldn’t be next.
But a few hours went by, and a few things started to happen. First, a number of people in the community told me that leaving the community would hurt the community overall. Others told me they were beside themselves at Derick’s action, and wouldn’t be participating in any “elite” channels. I got a number of supportive emails. A number of people said “well if we can’t be a part of that channel, we’ll start our own” and did just that. Not to be exclusive - just to reduce the noise level.
As time went on I realized that the community really is larger than any one person. Even when one person does something inconsiderate or rude, that doesn’t and shouldn’t define the community. I was really ready to let Derick have it, to walk away, and to never be a part of the community ever again. But no one person defines the community, and no one person can shape it, no matter who they are. That’s the point of the PHP community.
Instead I’ll practice some forgiveness. Derick, you’re welcome in any channel I’m in, any time. Xdebug is fantastic and your book on dates and times is perhaps the preeminent work on the topic. Your contributions to the core of PHP and its documentation are unmatched by anyone, and with 130 presentations to your name, you certainly are accomplished. You’re a part of the community, and I welcome you.
For those wondering how to join the community, it’s easy. Show up in IRC, or come to a conference. Find a user group in your area, or join one of PHP’s many project mailing lists. If you need direction, Elizabeth Smith (@auroraeosrose) always has direction on work that needs to be done in the community. Dive in, get involved, don’t be a jerk and meet people. Put on a thick skin, watch out for toes, put on your work clothes and get involved.
Tell them I sent you.
One of the greatest things about the PHP community is the willingness of people to help one another.
Picture this: you’ve got developers of all levels, working in the same room. They’ve been tasked with working on various open source projects. Only in PHP do you see expert level developers like Eli White, Matthew Turland and Sara Golemon teaching more inexperienced developers (like myself!).
Also in the PHP community: Sara Golemon was having a debate with a younger programmer who insisted that for loops were slower than while and do-while loops. Upon examining the source she discovered that in fact it was wrong - and she promised to look into it. You won’t get that in the Rails community.
The best thing about conferences is the community around them. Unassuming giants in code and contributions work alongside newbie developers to solve problems and discuss important issues. People are willing to teach and others are eager to learn, making the PHP community grow.
Day 1 sure started with a bang, with Andrei Zmievski doing a phenomenal keynote on the future of PHP. There were great photos, including the (in)famous “ball of nails” quote from Terry Chay. Then it was off to the breakout sessions.
Eli White knocked one out of the park with “Highly Scalable Web Applications” while Wez Furlong did a great job with “Getting It Done.” Liz Smith rocked the house with “SPL To The Rescue,” a talk that was so amazing it was being talked about at the cocktail party that night.
Off to dinner, I had a great conversation with Snipe, and then we went to the cocktail party where we heard presentations of twenty slides, twenty seconds each. A great “first day” (second for those who came to the tutorials) at php|tek.
Tuesday started off with little fanfare but much anticipation as the tutorial day at php|tek got underway. There were a lot of great tutorials, but I chose to attend the Security Boot Camp by Christian Wenz, and the Subversion tutorial by Lorna Mitchell and Matthew Weier O’Phinney.
Both tutorials were great, with the Subversion tutorial being the more useful of the two for me. Afterwards we all went out for some deep dish Chicago-style pizza, which is an experience in itself. Deep dish Chicago pizza is a pizza with a semi-thick crust and about two inches of cheese. The sauce is placed on top of the pizza (rather than between the crust and the cheese) to keep the cheese from drying out. It’s truly an experience.
Today starts the formal conference, and we’ll have some great sessions by Eli White, Chris Shiflett, Elizabeth Smith and others. I can’t wait!
Next week is php|tek in Chicago.
Every day I’ll be blogging, tweeting and generally writing about how the conference is going. It will be fresh, fast, and relevant.
Also, in conjunction with John Bafford and my company, The Bivings Group, we’ve set up an auto-updating php|tek Twitter feed at [twitter.bivings.com]. Note: Internet Explorer doesn’t work 100% well with this site, but Firefox, Chrome and Safari work perfectly.
So come and join us, in person, on Twitter, online. We’ll see you there!
Thanks everyone for a great Wordcamp Mid-Atlantic and for attending Wordpress Caching! Here are the slides so you can download them for your own resource.
Remember, these are licensed under the Creative Commons license: [creativecommons.org] or compatible license such as GPL. Please attribute me and don’t sell this content!
Download the slides here: Wordpress Caching
My article from last week, “On Code Commenting and Technical Debt” raised a lot of response throughout the community. I think that discussion is great, and I’m all for a debate that enhances the community. But I feel as though my argument has been taken a bit out of context.
To that end, here are five things that I believe commenting is and five things that commenting is not.
5 Things Commenting IS NOT:
5 Things Commenting IS:
Truncated by Planet PHP, read more at the original (another 1592 bytes)
If you’ve been in development for any length of time, you’ve probably come across a project or a company that doesn’t take the time to incorporate comments into its code. The argument is often made that “our code is self-documenting” and that commenting is just a waste of time, especially if you write clear, clean code. But I disagree.
Some people take a different view of commenting than others. Most recently, Eli White wrote an article called “Commenting on Commenting” in which he argued for commenting virtually every single line of code. He talked about working for a company where they stripped the code out and turned the comments into line-by-line documentation. But commenting to me is important for a different reason:
Failure to comment is simply the accrual of technical debt.
Technical debt is the cost of fixing things that you didn’t fix at a particular point because it would have been too costly. This cost can be in time or money, but represents additional effort that must be made. How does this apply to commenting?
Assume it takes one additional hour per file to write clear, complete comments. To save that hour now, you do not comment your code (before or after you write it), delivering the product faster. However, the next developer to come along and work on that code is going to have to learn about it. Assume that it would take them an hour to learn about it with comments, and four hours without. The commented path costs two hours in total (the hour to write the comments, and the hour to read them); the uncommented path costs four hours of a developer relearning your code, for a total cost of 2 extra hours.
This clearly isn’t worth it.
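The arithmetic can be spelled out in a few lines. The hour figures are the illustrative ones from the scenario above, and totalHours() is just a hypothetical helper name.

```php
<?php
// Total hours spent on one file: time writing comments plus time the next
// developer needs to learn the code.
function totalHours($hoursToComment, $hoursToLearn)
{
    return $hoursToComment + $hoursToLearn;
}

$withComments    = totalHours(1, 1); // write comments, then learn with them
$withoutComments = totalHours(0, 4); // skip comments, learn the hard way

echo "With comments:    {$withComments}h\n";    // 2h
echo "Without comments: {$withoutComments}h\n"; // 4h
echo 'Technical debt:   ', $withoutComments - $withComments, "h\n"; // 2h
```

Changing the inputs shows how quickly the debt grows once more than one developer has to relearn the same file.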
Undoubtedly somewhere out there someone is going to say something like “wait a minute, Brandon! What if it takes less time?” If it actually takes less time, fine. You got lucky. Think back on your prior experiences, though. The cleanest code you’ve ever seen written by somebody else - how long did it take you to learn? The problem comes not because we write bad code or are sloppy in our design. It comes because we’re human beings and we have to understand the logic before we can modify it. It’s the same reason a doctor who specializes in the heart must understand the other body systems before he can do surgery - even if he never treats any other organs, he must understand how they work together. In the same way, a programmer must understand the logic behind code before they can modify it, and failure to adequately justify your architecture incurs a cost.
This myth of self-documenting code needs to be squashed once and for all. Commenting is important, and unless your code does nothing other than display “Hello World!” it’s going to need some explanation.
Marco Tabini, a man I deeply respect, posted a rebuttal to my argument here. I recommend you read it, as he made some really great points.