When you design a software product or feature you need to consider not only what the software will do, but also how it will interact with the user. The functional requirements for the software typically refer to what the software does. Nonfunctional requirements clarify the parameters for how the software will meet the functional requirements. Common nonfunctional requirements include things like reliability, availability, security, safety, usability, programmability, maintainability and performance. All of these are important, but your software's performance will have a disproportionate impact on how your software will feel when people use it. I think Apple put it well in their Apple Human Interface Guidelines:
"Performance is the perceived measure of how fast or efficient your software is and it is critical to the success of all software. If your software seems slow, users may be less inclined to buy it. Even software that uses the most optimal algorithms may seem slow if it spends more time processing data than responding to the user. ... Remember that the perception of performance is informed by two things: The speed with which an application processes data and performs operations and the speed with which the application responds to the user." 
While performance is one of the most important nonfunctional requirements, it's often the most difficult to define. For new features it's difficult to know where to set the performance goal because there's not always some similar functionality to compare it against. Further, how would you define "slow" or "fast" in an objective and verifiable way? Confronted with this problem most software engineers simply skip this section of requirements with the justification, "If it's too slow, I'll see it and we'll fix it then. I know slow when I see it." If performance is specified, often some arbitrary time limit is set with little reasoning behind the performance goal.
Failing to specify reasonable performance requirements makes it very difficult to verify that your software is actually meeting your users' performance expectations. And what are these user expectations anyway? How can you determine what makes one piece of software fast and another slow? Understanding a little about the psychology of time perception can answer these questions. Armed with this understanding, you can specify, design and build for performance from the very beginning, which tremendously improves the chances of a high-performance solution.
Any human-computer interaction can be thought of as a conversation between the human and the computer. The user does something, and the software and hardware respond to that request. The time it takes the system to respond to the user's request is the system response time.
There has been quite a bit of research in the area of system response times. In 1968 R. B. Miller wrote a paper titled "Response Time in Man-Computer Conversational Transactions." The Department of Defense created MIL-STD 1472F, a 219-page document titled "Department of Defense Design Criteria Standard: Human Engineering" (revision F), which describes many of the nonfunctional requirement standards for use in the military. In 1986 the MITRE Corporation, sponsored by the US Air Force, published a document titled "Guidelines for Designing User Interface Software." In 1996 the Department of Defense created an eight-volume work entitled "Technical Architecture Framework for Information Management (TAFIM)"; the last volume of this work includes guidelines for response times.
All of these standards describe types or classes of user actions and give guidance on the acceptable system response time for each. More recently Steven C. Seow published a book titled "Designing and Engineering Time." This excellent book describes in detail some of the important considerations for defining appropriate response times and distills the combined recommendations of previous research into a simple framework. The general framework Seow suggests is as follows:
Instantaneous (0.1 to 0.2 seconds)
Immediate (0.5 to 1.0 seconds)
Continuous (2 to 5 seconds)
Captive (7 to 10 seconds)
You use these performance guidelines by asking yourself the question: For this feature, what is the user's expectation for response time? Is the user expecting an instantaneous response? If so, then you know your software should respond within 0.1 to 0.2 seconds.
So what is slow? These response time categories provide a powerful answer to this question. Slow is when a user expects an immediate response, within 0.5 to 1.0 seconds, and they get a continuous response, somewhere from 2 to 5 seconds! Slow is when a user expects an instantaneous response, within 0.1 to 0.2 seconds, and they get an immediate response, somewhere from 0.5 to 1.0 seconds!
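To make this definition of "slow" concrete, here is a minimal Python sketch. It assumes the upper bound of each range is the budget for that category; the names ResponseClass and is_slow are my own illustrations, not something taken from Seow's book.

```python
from enum import Enum


class ResponseClass(Enum):
    """Seow's four categories as (lower, upper) bounds in seconds."""
    INSTANTANEOUS = (0.1, 0.2)
    IMMEDIATE = (0.5, 1.0)
    CONTINUOUS = (2.0, 5.0)
    CAPTIVE = (7.0, 10.0)

    @property
    def upper(self) -> float:
        # Assumption: the upper bound of the range is the response-time budget.
        return self.value[1]


def is_slow(expected: ResponseClass, measured_seconds: float) -> bool:
    """A response feels slow when it takes longer than the user expected."""
    return measured_seconds > expected.upper


# The user expects an immediate response but waits 3 seconds.
print(is_slow(ResponseClass.IMMEDIATE, 3.0))
# The user expects an instantaneous response and gets one in 0.15 seconds.
print(is_slow(ResponseClass.INSTANTANEOUS, 0.15))
```

Notice that the same measured time can be fine for one interaction and slow for another. The verdict depends entirely on which category the user expected.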
Note: Can a response be too fast? Yes. A good example of a too-fast response is when a user starts a software installation and the install completes immediately. The actual response time doesn't conform to the user's expectation, which can cause the user to think the install didn't work properly.
Performance is a perceived reality based on the conversation between the human and the computer. As Qui-Gon Jinn said to Anakin, "Remember: Your focus determines your reality." This is especially true with software performance. What the user is focused on is not the performance of your application, at least not initially. They are focused on doing something with your application as the means to an end. When the application responds to their commands appropriately, the exchange becomes a natural conversation between the human and the computer and turns into a state of flow where the user is happy and productive. The challenge for a software developer is to maximize the probability that your software will disappear from the user's focus so that they can enter that zone of creativity. Response times that are too fast or too slow disrupt the user's state of flow and degrade the user experience.
Sadly, brilliant architecture doesn't matter if the user feels like you're wasting their time or something is wrong. They will feel like something is wrong if you don't ensure that the system responds within the expected time frame. Identifying the areas in your software that conflict with a user's expectations is the first step in making your software feel fast and responsive. Putting the user at the center of this question is the key to building high-performance software. Let's dive into what each of these response time categories means in detail.
Note: Steven C. Seow has long studied the distortion of time perception. I met him shortly after he joined Microsoft. Recently he has released his first book, "Designing and Engineering Time". It's a fantastic book, one I own and highly recommend. While I try to summarize some of his ideas on responsiveness, he goes into much more detail in his book and I recommend going there for a more complete understanding.
When a user moves a mouse or clicks a button, the expectation is that the software will respond instantaneously, that is, within 0.1 to 0.2 seconds. The easiest way to determine whether a part of your software falls into this category is to ask whether the interaction mimics some object in the physical world that also has an instantaneous response. Most forms of user input fall into this category. Clicking a menu and waiting for it to drop down or dragging a slider are examples of where an instantaneous response is expected. If you have ever woken your Mac laptop from sleep, clicked the AirPort menu and had it hesitate before displaying, you have experienced the problem where the expectation for an instantaneous response is not fulfilled.
The best example of an immediate response, between 0.5 and 1.0 seconds, is scrolling a window. The user's mental model is that the data has already been "loaded," so telling the computer to display a different section carries the expectation that it should occur immediately. The detailed operations behind any user interaction are hidden from the user. For example, fetching and rendering large documents often involves paging memory in and out, but this is invisible to the user. When they move to the next page they expect the response to be immediate, since their mental model tells them all the "hard work" was done when the document was first loaded. The expectation is what matters. In this realm of response times, the key is to communicate to the user that the request or command has been received and, if the action is simple, to return a complete response in less than 1 second. Animation can go a long way toward avoiding awkward pauses in the response between the system and the user. The iPhone's checkered background when scrolling a web page is a good example of immediate feedback while dealing with real hardware and software constraints.
Unless the user is expecting an instantaneous or immediate response there is generally recognition that the computer needs to "think" about doing stuff. Miller wrote:
"If you address another human being, you expect some communicative response within x seconds-perhaps two to four seconds. ... In conversation of any kind between humans, silences of more than four seconds become embarrassing because they imply a breaking of the thread of communication."
In the continuous category, between 2 and 5 seconds, it can be helpful and calming to let the user know that the computer is "thinking." Progress bars are often helpful in this case, but not required. On the Mac, both Keynote and PowerPoint use progress bars to inform the user that work is being done when loading documents. When the user asks your software to do something moderately complex, the continuous response will be the expectation.
In the 7 to 10 second response range, users need to see real progress and a visual response. I like to think of this as the captive audience range. You will have a user paying attention to what's going on in this range, but if anything takes longer than this, they'll move on to something else and come back to check progress later. A good example of this is downloading a fast start movie online. The user's attention span is about 10 seconds, so if your process takes longer than that you'll need to provide significant visual feedback about what's going on and be certain to give the user the ability to move on to other things.
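The feedback advice for these categories can be sketched as a small Python decision function. It assumes you can estimate roughly how long an operation will take and that the upper bound of each range is the cutoff. The function name feedback_for and the labels it returns are hypothetical and chosen only for illustration.

```python
def feedback_for(expected_seconds: float) -> str:
    """Suggest a kind of feedback for an operation of the given estimated length.

    Cutoffs are the upper bounds of the ranges discussed above. Treating them
    as hard boundaries is an assumption of this sketch.
    """
    if expected_seconds <= 1.0:
        # Instantaneous or immediate: simply respond, perhaps with animation.
        return "no extra indicator"
    if expected_seconds <= 5.0:
        # Continuous: let the user know the computer is "thinking."
        return "busy indicator"
    if expected_seconds <= 10.0:
        # Captive: show real progress while the user is still watching.
        return "progress bar"
    # Beyond the attention span: show progress and let the user move on.
    return "progress bar with option to continue other work"


for seconds in (0.2, 3, 8, 30):
    print(seconds, "->", feedback_for(seconds))
```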
The Process of Setting Performance Goals
For each user interaction in your software, ask yourself whether the user is expecting an Instantaneous response (0.1 to 0.2 seconds), an Immediate response (0.5 to 1.0 seconds), a Continuous response (2 to 5 seconds) or a Captive response (7 to 10 seconds). This will set the range of response times for that part of the system. Use the appropriate response time range as your performance goal. This will give the developer a basic understanding of where the performance for that feature needs to be and allow the tester to test for system responsiveness from the beginning.
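One way a tester might check a feature against its goal is sketched below in Python. The budget table uses the upper bound of each range, which is an assumption on my part. The helper names (measure, meets_goal) and the open_menu stand-in are hypothetical and not part of any real framework.

```python
import time

# Assumption: the upper bound of each range serves as the budget, in seconds.
BUDGET_SECONDS = {
    "instantaneous": 0.2,
    "immediate": 1.0,
    "continuous": 5.0,
    "captive": 10.0,
}


def measure(action, *args, **kwargs):
    """Run action and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = action(*args, **kwargs)
    return result, time.perf_counter() - start


def meets_goal(category: str, elapsed_seconds: float) -> bool:
    """True if the measured time is within the budget for the category."""
    return elapsed_seconds <= BUDGET_SECONDS[category]


def open_menu():
    """Hypothetical stand-in for a real user interaction under test."""
    return ["New", "Open", "Save"]


items, elapsed = measure(open_menu)
print(f"open_menu took {elapsed:.4f}s, meets goal: "
      f"{meets_goal('instantaneous', elapsed)}")
```

A single measurement like this only tells you about one run on one machine. In practice you would want to repeat it on the hardware your users actually have.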
Most features are easily classified into one of the four categories, but sometimes it's hard to tell. In those cases, usability studies can tell you whether your best guess was wrong.
When you need to choose which part of your application to focus on speeding up, understanding where and why users will perceive performance problems is key. You can't and shouldn't optimize everything. Remember, perception is reality. No matter what your metrics say, if the user thinks your application is slow, it is.
Objectively measured durations don't mean anything without a corresponding benchmark that shows what a user expects. Users will judge your software against their expectations. You need to identify what kind of expectations the user has for each stimulus and response in your application and make your software's response times meet these expectations. Users have four general categories of expectations: Instantaneous (0.1 to 0.2 seconds), Immediate (0.5 to 1.0 seconds), Continuous (2 to 5 seconds) and Captive (7 to 10 seconds). The more areas in your application where the user's expectations are met by your application's actual response, the faster the application will feel.
Maister's First Law of Service states that the key to satisfaction is the delta between what was expected and what was perceived. If the perception is that your software performs better than expected, satisfaction will be high, but if the perception is that your software performs worse than expected, satisfaction drops. Perceived durations and actual durations, along with an understanding of the users' tolerance for both, will allow you to carefully design software to meet and exceed user expectations.
Apple (2008). Apple Human Interface Guidelines. Available online at http://developer.apple.com/documentation/UserExperience/Conceptual/AppleHIGuidelines/OSXHIGuidelines.pdf page 31, 57
Miller, R. B. (1968). Response time in man-computer conversational transaction. Fall Joint Computer Conference U.S.A. 267-277.
Department of Defense Design Criteria Standard: Human Engineering. MIL-STD 1472F. Available online at http://hfetag.dtic.mil/docs-hfs/mil-std-1472f.pdf
Smith, S. L. and J. N. Mosier (1986). Guidelines for Designing User Interface Software: ESD-TR-86-278. Bedford, MA: The MITRE Corporation.
Department of Defense Technical Architecture Framework for Information Management (TAFIM). Volume 8: DoD Human Computer Interface Style Guide.
Seow, S. C. (2008). Designing and Engineering Time: The Psychology of Time Perception in Software. Addison-Wesley Professional.
Star Wars - Episode I, The Phantom Menace, 20th Century Fox, 2005.
Maister, D. H. (1985). The psychology of waiting lines. In Czepiel (Ed.), The Service Encounter. Lexington, MA: Lexington Books. 113-123.