It has been more than 20 years since John Hopcroft and Jeffrey Ullman first published this classic book on formal languages, automata theory, and computational complexity. Introduction to Automata Theory, Languages, and Computation, by John E. Hopcroft.

Introduction To Automata Theory Languages And Computation Pdf




Our conclusion is that S(X) is true for all X. The next two theorems are examples of facts that can be proved about trees and expressions. The formal statement S(T) we need to prove by structural induction is: if T is a tree with n nodes and e edges, then n = e + 1. The basis case is when T is a single node.

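The nodes of T are node N and all the nodes of the T_i's. The edges of T are the k edges we added explicitly in the inductive definition step, plus the edges of the T_i's. Hence, if each T_i has n_i nodes and e_i edges, T has 1 + n_1 + ... + n_k nodes and k + e_1 + ... + e_k edges. By the inductive hypothesis, n_i = e_i + 1 for each i, so the number of nodes is 1 + k + e_1 + ... + e_k. Thus, T has one more node than it has edges.

If G is defined by the basis, then G is a number or variable. These expressions have 0 left parentheses and 0 right parentheses, so the numbers are equal.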

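There are three rules whereby expression G may have been constructed according to the inductive step in the definition: G = E + F, G = E * F, or G = (E). We may assume that S(E) and S(F) are true; that is, E has the same number of left and right parentheses, say n of each, and F likewise has the same number of left and right parentheses, say m of each. Then we can compute the numbers of left and right parentheses in G for each of the three cases, as follows:

1. If G = E + F, then G has n + m left parentheses and n + m right parentheses.
2. If G = E * F, then G again has n + m of each.
3. If G = (E), then G has n + 1 left parentheses and n + 1 right parentheses.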

In each of the three cases, we see that the numbers of left and right parentheses in G are the same. This observation completes the inductive step and completes the proof.
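The parenthesis-counting argument can be spot-checked mechanically. The following is a minimal sketch, assuming the recursive definition of expressions has a basis (a number or variable) and the three inductive rules E + F, E * F, and (E); the function names and the sample symbols are illustrative and not from the text. It builds random expressions by those rules and confirms that left and right parentheses always match in number.

```python
import random


def build_expression(depth, rng):
    """Build a random expression from the basis and the three inductive rules.

    Assumed rules (not spelled out in this excerpt): E + F, E * F, and (E).
    """
    if depth == 0:
        # Basis: a number or a variable, which contains no parentheses.
        return rng.choice(["x", "y", "7"])
    e = build_expression(depth - 1, rng)
    f = build_expression(depth - 1, rng)
    rule = rng.choice(["sum", "product", "paren"])
    if rule == "sum":
        return e + "+" + f
    if rule == "product":
        return e + "*" + f
    return "(" + e + ")"


def balanced_counts(expr):
    """Return True when expr has equally many left and right parentheses."""
    return expr.count("(") == expr.count(")")


rng = random.Random(0)
samples = [build_expression(4, rng) for _ in range(200)]
all_balanced = all(balanced_counts(s) for s in samples)
```

A check like this is no substitute for the proof: it samples finitely many expressions, while the structural induction covers every expression the definition can produce.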

Automata theory provides many such situations. In Example 1. These statements tell under what sequences of inputs the automaton gets into each of the states. However, when there are really several independent statements to prove, it is generally less confusing to keep the statements separate and to prove them all in their own parts of the basis and inductive steps.

We call this sort of proof mutual induction. An example will illustrate the necessary steps for a mutual recursion. The automaton itself is reproduced as Fig.

Since pushing the button switches the state between on and off, and the switch starts out in the off state, we expect that the following statements will together explain the operation of the switch:

S1(n): The automaton is in state off after n pushes if and only if n is even.

S2(n): The automaton is in state on after n pushes if and only if n is odd.

We might suppose that S1 implies S2 and vice-versa, since we know that a number n cannot be both even and odd.

However, what is not always true about an automaton is that it is in one and only one state. It happens that the automaton of Fig.
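The two statements about the on/off switch can be checked for small n by direct simulation. This is a minimal sketch, assuming the switch starts in state off and each push toggles it, as described above; the function names are illustrative. It is a finite check only, and the mutual induction is what establishes the statements for every n.

```python
def state_after(n):
    """Simulate n pushes of the switch, starting from state 'off'."""
    state = "off"
    for _ in range(n):
        state = "on" if state == "off" else "off"
    return state


def check_statements(limit):
    """Check both if-and-only-if statements for every n below limit."""
    for n in range(limit):
        state = state_after(n)
        # First statement: in state off after n pushes iff n is even.
        s1 = (state == "off") == (n % 2 == 0)
        # Second statement: in state on after n pushes iff n is odd.
        s2 = (state == "on") == (n % 2 == 1)
        if not (s1 and s2):
            return False
    return True
```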

Repeat of the automaton of Fig. The proofs depend on several facts about odd and even integers. Since that is the start state, the automaton is indeed in state off after 0 pushes.

Thus, this part of the basis also holds. Since the hypothesis is false, we can again conclude that the if-then statement is true. Again, the proof separates into four parts. Thus, n is odd. Inspecting the automaton of Fig. The reader should be able to construct this part of the proof easily.

An alphabet is a finite, nonempty set of symbols. Common alphabets include: A string (or sometimes word) is a finite sequence of symbols chosen from some alphabet.

The string is another string chosen from this alphabet. The empty string is the string with zero occurrences of symbols. It is often useful to classify strings by their length, that is, the number of positions for symbols in the string. For instance, a string such as 01101 has length 5. Thus, there are only two symbols, 0 and 1, in that string, but there are five positions for symbols, and its length is 5. The standard notation for the length of a string w is |w|.

The former is an alphabet its members 0 and 1 are symbols. The latter is a set of strings. Put another way,. Thus, two appropriate equivalences are:. Type Convention for Symbols and Strings Commonly, we shall use lower-case letters at the beginning of the alphabet or digits to denote symbols, and lower-case letters near the end of the alphabet, typically w, x, y, and z , to denote strings. You should try to get used to this convention, to help remind you of the types of the elements being discussed.

Let x and y be strings. Then xy denotes the concatenation of x and y, that is, the string formed by making a copy of x and following it by a copy of y. However, common languages can be viewed as sets of strings.
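As a small illustration of length and concatenation, here is a sketch that treats Python strings as strings over the binary alphabet. The sample strings and the helper `strings_of_length` are illustrative choices, not taken from the text.

```python
from itertools import product


def strings_of_length(alphabet, k):
    """List every string of length k over the given alphabet, in sorted order."""
    return ["".join(symbols) for symbols in product(sorted(alphabet), repeat=k)]


sigma = {"0", "1"}   # the binary alphabet
x = "01101"          # five positions, but only two distinct symbols
y = "110"
xy = x + y           # concatenation: a copy of x followed by a copy of y
```

Here `len(xy)` equals `len(x) + len(y)`, and `strings_of_length(sigma, 0)` contains only the empty string.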

An example is English, where the collection of legal English words is a set of strings over the alphabet that consists of all the letters. Another example is C, or any other programming language, where the legal programs are a subset of the possible strings that can be formed from the alphabet of the language.

However, there are also many other languages that appear when we study automata. Some are abstract examples, such as:

The set of strings of 0's and 1's with an equal number of each.

The set of binary numbers whose value is a prime.

The only important constraint on what can be a language is that all alphabets are finite.

Thus languages, although they can have an infinite number of strings, are restricted to consist of strings drawn from one fixed, finite alphabet. In automata theory, a problem is the question of deciding whether a given string is a member of some particular language.

For some strings, this decision is easy. For instance, a string that begins with 0, such as 0011101, cannot be the representation of a prime, for the simple reason that every integer except 0 has a binary representation that begins with 1. However, it is less obvious whether a string such as 11101 belongs to L_p, so any solution to this problem will have to use significant computational resources of some kind. For instance, the task of the parser in a C compiler. It is also common to replace w by some expression with parameters and describe the strings in the language by stating conditions on the parameters.
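To make the idea of a problem as language membership concrete, here is a minimal sketch of a membership test for the language of binary strings whose value is a prime. The function names are illustrative, and trial division stands in for whatever primality test one prefers; it is chosen for clarity, not efficiency.

```python
def is_prime(n):
    """Trial-division primality test; adequate for small illustrative inputs."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def in_prime_language(w):
    """Decide whether the binary string w represents a prime number."""
    if w == "" or any(c not in "01" for c in w):
        return False
    if w[0] == "0":
        # Every integer except 0 has a representation that begins with 1,
        # and 0 itself is not prime, so a leading 0 rules the string out.
        return False
    return is_prime(int(w, 2))
```

The easy case (a leading 0) is rejected without any arithmetic, while the remaining strings require real computation, which mirrors the distinction drawn in the text.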

Here are some examples, the first with parameter n, the second with parameters i and j. Notice that, as with alphabets, we can raise a single symbol to a power n in order to represent n copies of that symbol.

This language consists of strings with some 0's possibly none followed by at least as many 1's. However, the parser does more than decide. It produces a parse tree, entries in a symbol table and perhaps more. In this theory, we are interested in proving lower bounds on the complexity of certain problems. Especially important are techniques for proving that certain problems cannot be solved in an amount of time that is less than exponential in the size of their input.
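The language just described, some 0's (possibly none) followed by at least as many 1's, can be written with parameters i and j as the set of strings 0^i 1^j with 0 <= i <= j. That set-former is inferred from the description above, and the function name below is illustrative. A direct membership test might look like this:

```python
def zeros_then_at_least_as_many_ones(w):
    """Decide membership in the set of strings 0^i 1^j with 0 <= i <= j."""
    i = 0
    # Count the leading block of 0's.
    while i < len(w) and w[i] == "0":
        i += 1
    rest = w[i:]
    # Everything after the 0's must be 1's.
    if any(c != "1" for c in rest):
        return False
    # There must be at least as many 1's as 0's.
    return i <= len(rest)
```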

Is It a Language or a Problem? Languages and problems are really the same thing. Which term we prefer to use depends on our point of view. When we care only about strings for their own sake, e. In those cases, where we care more about the thing represented by the string than the string itself, we shall tend to think of a set of strings as a problem. That is, if we can prove it is hard to decide whether a given string belongs to the language LX of valid strings in programming language X , then it stands to reason that it will not be easier to translate programs in language X to object code.

For if it were easy to generate code, then we could run the translator, and conclude that the input was a valid member of LX exactly when the translator succeeded in producing object code. We thus contradict the assumption that testing membership in LX is hard. It is an essential tool in the study of the complexity of problems, and it is facilitated greatly by our notion that problems are questions about membership in a language, rather than more general kinds of questions.

Finite Automata: Finite automata involve states and transitions among states in response to inputs.

Regular Expressions: These are a structural notation for describing the same patterns that can be represented by finite automata. They are used in many common types of software, including tools to search for patterns in text or in file names, for instance.

Context-Free Grammars: These are an important notation for describing the structure of programming languages and related sets of strings; they are used to build the parser component of a compiler.

Turing Machines: These are automata that model the power of real computers. They allow us to study decidability, the question of what can or cannot be done by a computer. They also let us distinguish tractable problems (those that can be solved in polynomial time) from the intractable problems (those that cannot).

Deductive Proofs: This basic method of proof proceeds by listing statements that are either given to be true, or that follow logically from some of the previous statements.

Proving If-Then Statements: Deductive proofs of if-then statements begin with the hypothesis, and continue with statements that follow logically from the hypothesis and previous statements, until the conclusion is proved as one of the statements.

Proving the Contrapositive: Proof by Contradiction: Sometimes we are asked to show that a certain statement is not true. If the statement has one or more parameters, then we can show it is false as a generality by providing just one counterexample, that is, one assignment of values to the parameters that makes the statement false.

Inductive Proofs: A statement that has an integer parameter n can often be proved by induction on n. We prove the statement is true for the basis, a finite number of cases for particular values of n, and then prove the inductive step.

Structural Inductions: In some situations, including many in this book, the theorem to be proved inductively is about some recursively defined construct, such as trees. We may prove a theorem about the constructed objects by induction on the number of steps used in its construction. This type of induction is referred to as structural.

An alphabet is any finite set of symbols. A string is a finite-length sequence of symbols.

Languages and Problems: A language is a (possibly infinite) set of strings, all of which choose their symbols from some one alphabet. When the strings of a language are to be interpreted in some way, the question of whether a string is in the language is sometimes called a problem.

Each of these problems is worked like conventional homework. The Gradiance system gives you four choices that sample your knowledge of the solution. If you make the wrong choice, you are given a hint or advice and encouraged to try the same problem again.

Problem 1. Suppose we want to prove the statement S(n): What is the concatenation of X and Y? The exception is the problem of finding palindromes, which are strings that are identical when reversed, regardless of their numerical value.

After an extended example that will provide motivation for the study to follow, we define finite automata formally.

We conclude the chapter with a study of an extended nondeterministic automaton that has the additional choice of making a transition from one state to another spontaneously, i.

However, we shall find them quite important in Chapter 3, when we study regular expressions and their equivalence to automata. The study of the regular languages continues in Chapter 3. There, we introduce another important way to describe regular languages: After discussing regular expressions, and showing their equivalence to finite automata, we use both automata and regular expressions as tools in Chapter 4 to show certain important properties of the regular languages.

The latter are algorithms to answer questions about automata or regular expressions, e. The seller must know that the file has not been forged, nor has it been copied and sent to the seller, while the customer retains a copy of the same file to spend again. The nonforgeability of the file is something that must be assured by a bank and by a cryptography policy. However, the bank has a second important job: However, in order to use electronic money, protocols need to be devised to allow the manipulation of the money in a variety of ways that the users want.

Because monetary systems always invite fraud, we must verify whatever policy we adopt regarding how money is used. In the balance of this section, we shall introduce a very simple example of a bad electronic-money protocol, model it with finite automata, and show how constructions on automata can be used to verify protocols (or, in this case, to discover that the protocol has a bug).

There are three participants: The customer may decide to transfer this money file to the store, which will then redeem the file from the bank i.

In addition, the customer has the option to cancel the file. That is, the customer may ask the bank to place the money back in the customer's account, making the money. Interaction among the three participants is thus limited to five events:

1. The customer may decide to pay. That is, the customer sends the money to the store.
2. The customer may decide to cancel. The money is sent to the bank with a message that the value of the money is to be added to the customer's bank account.
3. The store may ship goods to the customer.
4. The store may redeem the money. That is, the money is sent to the bank with a request that its value be given to the store.
5. The bank may transfer the money by creating a new, suitably encrypted money file and sending it to the store.

The three participants must design their behaviors carefully, or the wrong things may happen.

In our example, we make the reasonable assumption that the customer cannot be relied upon to act responsibly. In particular, it must make sure that two stores cannot both redeem the same money file, and it must not allow money to be both canceled and redeemed. The store should be careful as well. In particular, it should not ship goods until it is sure it has been given valid money for the goods. Protocols of this type can be represented as finite automata. Each state represents a situation that one of the participants could be in.

Transitions between states occur when one of the five events described above occurs. It turns out that what is important about the problem is what sequences of events can happen, not who is allowed to initiate them. Figure 2.

The bank does not know that the money has been sent by the customer to the store; it discovers that fact only when the store executes the action redeem. Let us examine first the automaton (c) for the bank. The start state is state 1; it represents the situation where the bank has issued the money file in question but has not been requested either to redeem it or to cancel it.

[Figure: Finite automata representing a customer, a store, and a bank]

If a cancel request is sent to the bank by the customer, then the bank restores the money to the customer's account and enters state 2.

The latter state represents the situation where the money has been cancelled. The bank, being responsible, will not leave state 2 once it is entered, since the bank must not allow the same money to be cancelled again or spent by the customer.

If so, it goes to state 3, and shortly sends the store a transfer message, with a new money file that now belongs to the store. After sending the transfer message, the bank goes to state 4. In that state, it will accept neither cancel nor redeem requests, nor will it perform any other actions regarding this particular money file.

Now, let us consider Fig. While the bank always does the right thing, the store's system has some defects. Imagine that the shipping and financial operations are done by separate processes, so there is the opportunity for the ship action to be done either before, after, or during the redemption of the electronic money.

That policy allows the store to get into a situation where it has already shipped the goods and then finds out the money was bogus. The store starts out in state a. The bank will in fact be running the same protocol with a large number of electronic pieces of money, but the workings of the protocol are the same for each of them, so we can discuss the problem as if there were only one piece of electronic money in existence. In this state, the store begins both the shipping and redemption processes.

If the goods are shipped first, then the store enters state c, where it must still redeem the money from the bank and receive the transfer of an equivalent money file from the bank. Alternatively, the store may send the redeem message first, entering state d. From state d, the store might next ship, entering state e, or it might next receive the transfer of money from the bank, entering state f. From state f, we expect that the store will eventually ship, putting the store in state g, where the transaction is complete and nothing more will happen.

In state e, the store is waiting for the transfer from the bank. Unfortunately, the goods have already been shipped, and if the transfer never occurs, the store is out of luck. Last, observe the automaton for the customer, Fig.

While the three automata of Fig. However, in the formal definition of a finite automaton, which we shall study in Section 2. Thus, the automaton for the store needs an additional arc from each state to itself, labeled cancel. Another potential problem is that one of the participants may, intentionally or erroneously, send an unexpected message, and we do not want this action to cause one of the automata to die.

For instance, suppose the customer decided to execute the pay action a second time, while the store was in state e. Since that state has no arc out with label pay, the store's automaton would die before it could receive the transfer from the bank. In summary, we must add to the automata of Fig. The two kinds of actions that must be ignored are: Actions that are irrelevant to the participant involved. As we saw, the only irrelevant action for the store is cancel, so each of its seven states has a loop labeled cancel. For the bank, both pay and ship are irrelevant, so we have put at each of the bank's states an arc labeled pay, ship.

[Figure: The complete sets of transitions for the three automata]

For the customer, ship, redeem and transfer are all irrelevant, so we add arcs with these labels. Of course, the customer is still a participant, since it is the customer who initiates the pay and cancel actions. However, as we mentioned, the matter of who initiates actions has nothing to do with the behavior of the automata. Actions that must not be allowed to kill an automaton. As mentioned, we must not allow the customer to kill the store's automaton by executing pay again, so we have added loops with label pay to all but state a where the pay action is expected and relevant.

We have also added loops with labels cancel to states 3 and 4 of the bank, in order to prevent the customer from killing the bank's automaton by trying to cancel money that has already been redeemed.

The bank properly ignores such a request. Likewise, states 3 and 4 have loops on redeem. The store should not try to redeem the same money twice, but if it does, the bank properly ignores the second request. While we now have models for how the three participants behave, we do not yet have a representation for the interaction of the three participants.

As mentioned, because the customer has no constraints on behavior, that automaton has only one state, and any sequence of events lets it stay in that state i. However, both the store and bank behave in a complex way, and it is not immediately obvious in what combinations of states these two automata can be. The normal way to explore the interaction of automata such as these is to construct the product automaton.

That automaton's states represent a pair of states, one from the store and one from the bank. We show the product automaton in Fig. For clarity, we have arranged the 28 states in an array. The row corresponds to the state of the bank and the column to the state of the store. To save space, we have also abbreviated the labels on the arcs, with P , S , C , R, and T standing for pay, ship, cancel, redeem, and transfer, respectively.

However, it is important to notice that if an input action is received, and one of the two. That state corresponds to the situation where the bank is in state i and the store in state x. Let Z be one of the input actions. We look at the automaton for the bank, and see whether there is a transition out of state i with label Z.

Suppose there is, and it leads to state j which might be the same as i if the bank loops on input Z. Then, we look at the store and see if there is an arc labeled Z leading to some state y. We can now see how the arcs of Fig. For instance, on input pay, the store goes from state a to b, but stays put if it is in any other state besides a. The bank stays in whatever state it is in when the input is pay, because that action is irrelevant to the bank.

This observation explains the four arcs labeled P at the left ends of the four rows in Fig. For another example of how the arcs are selected, consider the input redeem. If the bank receives a redeem message when in state 1, it goes to state 3. If in states 3 or 4, it stays there, while in state 2 the bank automaton dies i. The store, on the other hand, can make transitions from state b to d or from c to e when the redeem input is received.

In Fig. Inaccessible states need not be included in the automaton, and we did so in this example just to be systematic.

That is, can the product automaton get into a state in which the store has shipped that is, the state is. In terms of what the bank is doing, once it has gotten to state 3, it has received the redeem request and processed it.

That means it must have been in state 1 before receiving the redeem and therefore the cancel message had not been received and will be ignored if received in the future. Thus, the bank will eventually perform the transfer of money to the store. The state is accessible, but the only arc out leads back to that state.

This state corresponds to a situation where the bank received a cancel message before a redeem message. However, the store received a pay message i. The store foolishly shipped before trying to redeem the money, and when the store does execute the redeem action, the bank will not even acknowledge the message, because it is in state 2, where it has canceled the money and will not process a redeem request.
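The product construction and the search for accessible states can be sketched in a few lines. The transition tables below are transcribed from the prose description of the store and bank automata, including the added loops for ignored actions. The excerpt does not say whether bank state 2 loops on cancel, so this sketch leaves that arc out; the function and variable names are illustrative. A product move on an event exists only when both automata have a move on it.

```python
from collections import deque

EVENTS = ["pay", "ship", "cancel", "redeem", "transfer"]


def store_transitions():
    """Store automaton, states a..g, as described in the text."""
    delta = {
        ("a", "pay"): "b",
        ("b", "ship"): "c", ("b", "redeem"): "d",
        ("c", "redeem"): "e",
        ("d", "ship"): "e", ("d", "transfer"): "f",
        ("e", "transfer"): "g",
        ("f", "ship"): "g",
    }
    for s in "abcdefg":
        delta[(s, "cancel")] = s      # cancel is irrelevant to the store
        if s != "a":
            delta[(s, "pay")] = s     # a repeated pay must not kill the store
    return delta


def bank_transitions():
    """Bank automaton, states 1..4 (no cancel loop on state 2: an assumption)."""
    delta = {("1", "cancel"): "2", ("1", "redeem"): "3", ("3", "transfer"): "4"}
    for s in "1234":
        delta[(s, "pay")] = s         # pay and ship are irrelevant to the bank
        delta[(s, "ship")] = s
    for s in "34":
        delta[(s, "cancel")] = s      # late cancel requests are ignored
        delta[(s, "redeem")] = s      # repeated redeem requests are ignored
    return delta


def product_step(bank, store, state, event):
    """Move the pair (bank state, store state) on event, or return None."""
    i, x = state
    if (i, event) in bank and (x, event) in store:
        return (bank[(i, event)], store[(x, event)])
    return None


def reachable(bank, store, start=("1", "a")):
    """Breadth-first search for the accessible states of the product."""
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for event in EVENTS:
            nxt = product_step(bank, store, state, event)
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


BANK = bank_transitions()
STORE = store_transitions()
REACHABLE = reachable(BANK, STORE)
```

With these tables, the pair (2, c) is accessible via pay, ship, cancel, and every move out of it leads back to it. That is the bug discussed above: the store has shipped, but the bank has cancelled the money and will never transfer it.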

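We begin by introducing the formalism of a deterministic finite automaton, one that is in a single state after reading any sequence of inputs. A deterministic finite automaton consists of:

1. A finite set of states, often denoted Q.
2. A finite set of input symbols.
3. A transition function that takes as arguments a state and an input symbol and returns a state. If q is a state, and a is an input symbol, then the transition function gives the state that the automaton enters from q on input a.
4. A start state, one of the states in Q.
5. A set of final or accepting states F. The set F is a subset of Q.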


A deterministic finite automaton will often be referred to by its acronym: DFA. The most succinct representation of a DFA is a listing of the five components above. We start out with the DFA in its start state, q0. Let us formally specify a DFA that accepts all and only the strings of 0's and 1's that have the sequence 01 somewhere in the string. We can write this language L as: {w | w is of the form x01y for some strings x and y consisting of 0's and 1's only}. What do we know about an automaton that can accept this language L?

It has some set of states, Q, of which one, say q0, is the start state. This automaton has to remember the important facts about what inputs it has seen so far. To decide whether 01 is a substring of the input, A needs to remember:

1. Has it already seen 01? If so, then it accepts every sequence of further inputs.
2. Has it never seen 01, but its most recent input was 0, so if it now sees a 1, it will have seen 01 and can accept everything it sees from here on?
3. Has it never seen 01, but its last input was either nonexistent (it just started) or it last saw a 1? In this case, A cannot accept until it first sees a 0 and then sees a 1 immediately after.

These three conditions can each be represented by a state. Condition 3 is represented by the start state, q0. Surely, when just starting, we need to see a 0 and then a 1.

But if in state q0 we next see a 1, then we are no closer to seeing 01, and so we must stay in state q0. However, if we are in state q0 and we next see a 0, we are in condition 2. That is, we have never seen 01, but we have our 0. Thus, let us use q2 to represent condition 2. Now, let us consider the transitions from state q2. We have not seen 01, but 0 was the last symbol, so we are still waiting for a 1.

If we are in state q2 and we see a 1 input, we now know there is a 0 followed by a 1. We can go to an accepting state, which we shall call q1 , and which corresponds to condition 1 above.

Finally, we must design the transitions for state q1. In this state, we have already seen a 01 sequence, so regardless of what happens, we shall still be in a situation where we've seen 01. The complete specification of the automaton A that accepts the language L of strings that have a 01 substring is A = ({q0, q1, q2}, {0, 1}, δ, q0, {q1}), where δ is the transition function described above.
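The automaton just designed can be written down and run directly. This is a minimal sketch in which the five components are stored in a dictionary; that representation and the names `DFA_01` and `accepts` are illustrative choices. The transitions follow the three conditions above: q0 waits for a 0, q2 has just seen a 0, and q1 has seen 01.

```python
DFA_01 = {
    "states": {"q0", "q1", "q2"},
    "alphabet": {"0", "1"},
    "delta": {
        ("q0", "0"): "q2",  # saw a 0: now waiting for a 1
        ("q0", "1"): "q0",  # a 1 brings us no closer to seeing 01
        ("q2", "0"): "q2",  # still have a 0 as the most recent symbol
        ("q2", "1"): "q1",  # 0 followed by 1: accept from here on
        ("q1", "0"): "q1",
        ("q1", "1"): "q1",
    },
    "start": "q0",
    "accepting": {"q1"},
}


def accepts(dfa, w):
    """Run the DFA on w, one symbol at a time, and report acceptance."""
    state = dfa["start"]
    for symbol in w:
        state = dfa["delta"][(state, symbol)]
    return state in dfa["accepting"]
```

For example, `accepts(DFA_01, "1101")` is true, while `accepts(DFA_01, "1110")` is false because that string never has a 0 immediately followed by a 1.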

There are two preferred notations for describing automata: A transition diagram, which is a graph such as the ones we saw in Section 2. Then the transition diagram has an arc from node q to node p, labeled a. If there are several input symbols that cause transitions from q to p, then the transition diagram can have one arc, labeled by the list of these symbols.

This arrow does not originate at any node. States not in F have a single circle. Example 2. We see in that diagram the three nodes that correspond to the three states. There is a Start arrow entering the start state, q0 , and the one accepting state, q1 , is represented by a double circle. Out of each state is one arc labeled 0 and one arc labeled 1 although the two arcs are combined into one with a double label in the case of q1.

The rows of the table correspond to the states, and the columns correspond to the inputs. We have also shown two other features of a transition table. The start state is marked with an arrow, and the accepting states are marked with a star.

Since we can deduce the sets of states and input symbols by looking at the row and column heads, we can now read from. We have explained informally that the DFA defines a language: In terms of the transition diagram, the language of a DFA is the set of labels along all the paths that lead from the start state to any accepting state.

Now, we need to make the notion of the language of a DFA precise. To do so, we define an extended transition function that describes what happens when we start in any state and follow any sequence of inputs. The extended transition function is a function that takes a state q and a string w and returns a state p (the state that the automaton reaches when starting in state q and processing the sequence of inputs w).

That is, if we are in state q and read no inputs, then we are still in state q.

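Suppose w is a string of the form xa; that is, a is the last symbol of w, and x is the string consisting of all but the last symbol. Then the extended transition function applied to q and w is obtained by first computing the state reached from q on x, and then applying the transition function to that state and the symbol a.

It should not be surprising that the job of the states of this DFA is to count both the number of 0's and the number of 1's, but count them modulo 2.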

That is, the state is used to remember whether the number of 0's seen so far is even or odd, and also to remember whether the number of 1's seen so far is even or odd. There are thus four states, which can be given the following interpretations:

Both the number of 0's seen so far and the number of 1's seen so far are even. The number of 0's seen so far is even, but the number of 1's seen so far is odd.

The number of 1's seen so far is even, but the number of 0's seen so far is odd. Both the number of 0's seen so far and the number of 1's seen so far are odd. State q0 is both the start state and the lone accepting state. It is the start state, because before reading any inputs, the numbers of 0's and 1's seen so far are both zero, and zero is even.

It is the only accepting state, because it describes exactly the condition for a sequence of 0's and 1's to be in language L. Transition diagram for the DFA of Example 2. It is.


Notice how each input 0 causes the state to cross the horizontal, dashed line. Thus, after seeing an even number of 0's we are always above the line, in state q0 or q1. Likewise, every 1 causes the state to cross the vertical, dashed line. Thus, after seeing an even number of 1's, we are always to the left, in state q0 or q2, while after seeing an odd number of 1's we are to the right, in state q1 or q3.

These observations are an informal proof that the four states have the interpretations attributed to them. However, one could prove the correctness of our claims about the states formally, by a mutual induction in the spirit of Example 1. We can also represent this DFA by a transition table. Suppose the input is Since this string has even numbers of 0's and 1's both, we expect it is in the language.

Let us now verify that claim. Transition table for the DFA of Example 2. The summary of this calculation is:. We tend to use the same variables to denote the same thing across all examples, because it helps to remind you of the types of variables, much the way a variable i in a program is almost always of integer type.

However, we are free to call the components of an automaton, or anything else, anything we wish. For example, the DFA's of Examples 2. However, the two transition functions are each local variables, belonging only to their examples.
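The extended transition function and the even-0's, even-1's DFA can be combined in a short sketch. The table below follows the state interpretations given above (q0: both counts even; q1: 0's even and 1's odd; q2: 0's odd and 1's even; q3: both odd). The recursive function mirrors the basis and inductive step of the definition. The names `PARITY_DELTA`, `delta_hat`, and `accepts_even_even` are illustrative.

```python
PARITY_DELTA = {
    # Each 0 flips the parity of 0's; each 1 flips the parity of 1's.
    ("q0", "0"): "q2", ("q0", "1"): "q1",
    ("q1", "0"): "q3", ("q1", "1"): "q0",
    ("q2", "0"): "q0", ("q2", "1"): "q3",
    ("q3", "0"): "q1", ("q3", "1"): "q2",
}


def delta_hat(delta, q, w):
    """Extended transition function, defined by induction on the length of w."""
    if w == "":
        # Basis: reading no input leaves the automaton in state q.
        return q
    # Induction: w = xa, where a is the last symbol of w.
    x, a = w[:-1], w[-1]
    return delta[(delta_hat(delta, q, x), a)]


def accepts_even_even(w):
    """Accept exactly when w has an even number of 0's and of 1's."""
    return delta_hat(PARITY_DELTA, "q0", w) == "q0"
```

On a sample input such as 110101, which has two 0's and four 1's, the computation passes through q1, q0, q2, q3, q1 and ends in q0, so the string is accepted.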

L A is the set of all strings of 0's and 1's that contain a substring Whenever a marble encounters a lever, it causes the lever to reverse after the marble passes, so the next marble will take the opposite branch.

Let acceptance correspond to the marble exiting at D; nonacceptance represents a marble exiting at C. How would your answers to parts a and b change? Exercise 2. Perform an induction on |y|. Use Exercise 2. For example, strings , , and are in the language 0, , and are not.

Examples of strings in the language are 0, , , and Informally describe the language accepted by this DFA, and prove by induction on the length of an input string that your description is correct. When setting up the inductive hypothesis, it is wise to make a statement about what inputs get you to each state, not just what inputs get you to the accepting state. For instance, when the automaton is used to search for certain sequences of characters e.

We shall see an example of this type of application in Section 2. Before examining applications, we need to define nondeterministic finite automata and show that each one accepts a language that is also accepted by some DFA. However, there are reasons to think about NFA's. They are often more succinct and easier to design than DFA's. Like the DFA, an NFA has a finite set of states, a finite set of input symbols, one start state and a set of accepting states.

We shall start with an example of an NFA, and then make the definitions precise. It is always possible that the next symbol does not begin the final 01, even if that symbol is 0. Thus, state q0 may transition to itself on both 0 and 1. However, if the next symbol is 0, this NFA also guesses that the final 01 has begun. An arc labeled 0 thus leads from q0 to state q1. Notice that there are two arcs labeled 0 out of q0. The NFA has the option of going either to q0 or to q1, and in fact it does both, as we shall see when we make the definitions precise.

[Figure: An NFA accepting all strings that end in 01]

In state q1, the NFA checks that the next symbol is 1, and if so, it goes to state q2 and accepts. Notice that there is no arc out of q1 labeled 0, and there are no arcs at all out of q2. The states an NFA is in during the processing of input sequence Figure 2. We have shown what happens when the automaton of Fig. It starts in only its start state, q0. When the first 0 is read, the NFA may go to either state q0 or state q1, so it does both.

These two threads are suggested by the second column in Fig. Then, the second 0 is read. State q0 may again go to both q0 and q1. We find that q0 goes only to q0 on 1, while q1 goes only to q2. Thus, after reading , the NFA is in states q0 and q2.

Since q2 is an accepting state, the NFA accepts However, the input is not finished. The fourth input, a 0, causes q2's thread to die, while q0 goes to both q0 and q1. The last input, a 1, sends q0 to q0 and q1 to q2. Since we are again in an accepting state, is accepted. Q is a finite set of states. The NFA of Fig. The idea was suggested by Fig. For instance, Fig. That is, without reading any input symbols, we are only in the state we began in.
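The thread-by-thread walk-through above can be reproduced with a few lines of Python. This is a minimal sketch, not the book's own code: the names `DELTA`, `step`, `trace` and `accepts` are hypothetical, and the sample input 00101 is an assumption chosen to match the sequence of symbols the walk-through describes (0, 0, 1, 0, 1).

```python
# Transition function of the three-state NFA that accepts strings ending in 01.
# A missing (state, symbol) entry means the thread in that state dies.
DELTA = {
    ("q0", "0"): {"q0", "q1"},  # stay in q0, and also guess that the final 01 has begun
    ("q0", "1"): {"q0"},
    ("q1", "1"): {"q2"},        # no arc out of q1 on 0, none at all out of q2
}
START = "q0"
ACCEPTING = {"q2"}


def step(states, symbol):
    """Return every state reachable from some state in `states` on `symbol`."""
    result = set()
    for q in states:
        result |= DELTA.get((q, symbol), set())
    return result


def trace(word):
    """Return the list of state sets the NFA is in after each prefix of `word`."""
    current = {START}
    history = [current]
    for symbol in word:
        current = step(current, symbol)
        history.append(current)
    return history


def accepts(word):
    """The NFA accepts if at least one surviving thread is in an accepting state."""
    return bool(trace(word)[-1] & ACCEPTING)


if __name__ == "__main__":
    for prefix_len, states in enumerate(trace("00101")):
        print(prefix_len, sorted(states))
    print(accepts("00101"))
```

Tracking a set of states rather than individual threads is a design choice that keeps the simulation linear in the input length, since two threads in the same state behave identically from then on.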

A summary of the steps is: Line 1 is the basis rule. Lines 5 and 6 are similar to lines 3 and 4. As we have suggested, an NFA accepts a string w if it is possible to make any sequence of choices of next state, while reading the characters of w, and go from the start state to any accepting state. The fact that other choices using the input symbols of w lead to a nonaccepting state, or do not lead to any state at all i.

As an example, let us prove formally that the NFA of Fig. The proof is a mutual induction of the following three statements that characterize the three states: To prove these statements, we need to consider how A can reach each state i.

The proof of the theorem is an induction on |w|, the length of w, starting with length 0. Thus, the hypotheses of both directions of the if-and-only-if statement are false, and therefore both directions of the statement are true.

We may assume statements 1 through 3 hold for x, and we need to prove them for w. (If) Assume that w ends in 0 i. If we look at the diagram of Fig. (If) Assume that w ends in Looking at the diagram of Fig. By statement 2 applied to x, we know that x ends in 0. Thus, w ends in 01, and we have proved statement 3. In the worst case, however, the smallest DFA can have 2^n states while the smallest NFA for the same language has only n states.

In general, many proofs about automata involve constructing one automaton from another. It is important for us to observe the subset construction as an example of how one formally describes one automaton in terms of the states and transitions of another, without knowing the specifics of the latter automaton.

Notice that the input alphabets of the two automata are the same, and the start state of D is the set containing only the start state of N. The other components of D are constructed as follows. Note that if QN has n states, then QD will have 2^n states. Often, not all these states are accessible from the start state of QD. Inaccessible states can. That is, FD is all sets of N's states that include at least one accepting state of N.

Notice that this transition table belongs to a deterministic finite automaton. Even though the entries in the table are sets, the states of the constructed DFA are sets. To make the point clearer, we can invent new names for these states, e.

The DFA transition table of Fig 2. Of the eight states in Fig. The other five states are inaccessible from the start state and may as well not be there. We know for certain that the singleton set consisting only of N's start state is accessible.

Suppose we have determined that set S of states is accessible. For the example at hand, we know that {q0} is a state of the DFA D.

Both these facts are established by looking at the transition diagram of Fig. We thus have one row of the transition table for the DFA: For instance, to see the latter calculation, we know that.

We now have the fifth row of Fig. Thus, the subset construction has converged; we know all the accessible states and their transitions. The entire DFA is shown in Fig.
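The "lazy" procedure just described, which starts from the singleton set containing N's start state and adds a row only for sets found to be accessible, can be sketched as follows. This is an illustrative sketch under stated assumptions: the function name `subset_construction` and the dictionary encoding of the NFA are hypothetical choices, and the NFA used is the three-state automaton for strings ending in 01.

```python
from collections import deque


def subset_construction(delta, start, accepting, alphabet):
    """Build only the accessible part of the DFA equivalent to an NFA.

    delta maps (state, symbol) to a set of next states; missing keys mean
    the empty set. DFA states are frozensets of NFA states.
    """
    start_set = frozenset([start])
    dfa_delta = {}
    seen = {start_set}
    queue = deque([start_set])
    while queue:
        current = queue.popleft()
        for symbol in alphabet:
            # Union of the NFA moves from every member of the current set.
            target = frozenset(
                p for q in current for p in delta.get((q, symbol), ())
            )
            dfa_delta[(current, symbol)] = target
            if target not in seen:  # a newly discovered accessible set
                seen.add(target)
                queue.append(target)
    # A DFA state accepts if it contains at least one accepting NFA state.
    dfa_accepting = {s for s in seen if s & set(accepting)}
    return seen, dfa_delta, start_set, dfa_accepting


# The NFA for strings ending in 01 (state names follow the prose).
NFA_DELTA = {
    ("q0", "0"): {"q0", "q1"},
    ("q0", "1"): {"q0"},
    ("q1", "1"): {"q2"},
}

if __name__ == "__main__":
    states, table, start, finals = subset_construction(
        NFA_DELTA, "q0", {"q2"}, "01"
    )
    for s in sorted(states, key=sorted):
        print(sorted(s), [sorted(table[(s, a)]) for a in "01"])
```

The loop terminates once no new set appears, which is the point at which the construction "has converged". For this NFA only three of the eight possible subsets are ever reached.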


Notice that it has only three states, which is, by coincidence, exactly the same number of states as the NFA of Fig. However, the DFA of Fig. We need to show formally that the subset construction works, although the intuition was suggested by the examples.

After reading sequence of input.

Since the accepting states of the DFA are those sets that include at least one accepting state of the NFA, and the NFA also accepts if it gets into at least one of its accepting states, we may then conclude that the DFA and NFA accept exactly the same strings, and therefore accept the same language. Theorem 2. Now, let us use 2. Put intuitively, if we have the transition diagram for a DFA, we can also interpret it as the transition diagram of an NFA, which happens to have exactly one choice of transition in any situation.

We leave the proof to the reader. As a consequence, w is accepted by D if and only if it is accepted by N i. In Example 2. As we mentioned, it is quite common in practice for the DFA to have roughly the same number of states as the NFA from which it is constructed. However, exponential growth in the number of states is possible; all the 2^n DFA states that we could construct from an n-state NFA may turn out to be accessible. Intuitively, a DFA D that accepts this language must remember the last n symbols it has read.

Since any of 2^n subsets of the last n symbols could have been 1, if D has fewer. There is a state q0 that the NFA is always in, regardless of what inputs have been read. From state q1, any input takes N to q2, the next input takes it to q3, and so on, until n - 1 inputs later, it is in the accepting state qn. The formal statement of what the states of N do is: N is in state q0 after reading any sequence of inputs w. We shall not prove these statements formally; the proof is an easy induction on |w|, mimicking Example 2.

That says N is in state qn if and only if the nth symbol from the end is 1. But qn is the only accepting state, so that condition also characterizes exactly the set of strings accepted by N. The Pigeonhole Principle In Example 2. Since there are fewer states than sequences, one state must be assigned two sequences. The pigeonhole principle may appear obvious, but it actually depends on the number of pigeonholes being finite.

Thus, it works for finite-state automata, with the states as pigeonholes, but does not apply to other kinds of automata that have an infinite number of states. Then each of the infinite number of pigeons gets a pigeonhole, and no two pigeons have to share a pigeonhole.
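The exponential blow-up for the language "the nth symbol from the end is 1" can be observed experimentally. The sketch below is an assumption-laden illustration rather than the book's construction verbatim: states q0 through qn are encoded as the integers 0 through n, q0 is taken to move to q1 on input 1 (as the description of N suggests), and the helper names are hypothetical. It counts the DFA states that the subset construction actually reaches.

```python
def nth_from_end_nfa(n):
    """NFA over {0, 1}: state 0 loops on every input and guesses, on a 1,
    that this symbol is the nth from the end; states 1..n-1 then advance
    on any input, and state n is the only accepting state."""
    delta = {(0, "0"): {0}, (0, "1"): {0, 1}}
    for i in range(1, n):
        for symbol in "01":
            delta[(i, symbol)] = {i + 1}
    return delta


def count_accessible_dfa_states(n):
    """Run the subset construction from {0} and count the sets reached."""
    delta = nth_from_end_nfa(n)
    start = frozenset([0])
    seen = {start}
    frontier = [start]
    while frontier:
        current = frontier.pop()
        for symbol in "01":
            target = frozenset(
                p for q in current for p in delta.get((q, symbol), ())
            )
            if target not in seen:
                seen.add(target)
                frontier.append(target)
    return len(seen)


if __name__ == "__main__":
    for n in range(1, 7):
        print(n, count_accessible_dfa_states(n))
```

Each reachable set contains state 0 together with one state i for every position i from the end that holds a 1, which is why the count doubles with each increment of n for this family of NFA's.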


Give nondeterministic finite automata to accept the following languages. Try to take advantage of nondeterminism as much as possible. For instance, observe the automaton of Fig. Technically, this automaton is not a DFA, because it lacks transitions on most symbols from each of its states. However, such an automaton is an NFA. If we use the subset construction to convert it to a DFA, the automaton looks almost the same, but it includes a dead state, that is, a nonaccepting state that goes to itself on every possible input symbol.

In general, we can add a dead state to any automaton that has no more than one transition for any state and input symbol. Then, add a transition to the dead state from each other state q, on all input symbols for which q has no other transition. The result will be a DFA in the strict sense. Thus, we shall sometimes refer to an automaton as a DFA if it has at most one transition out of any state on any symbol, rather than if it has exactly one transition. Note that 0 is an allowable multiple of 4.
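The dead-state completion described above is mechanical, so a short sketch may help. The function name `add_dead_state`, the default state name `"dead"`, and the small example automaton are all hypothetical; the input is assumed to have at most one transition per state and symbol.

```python
def add_dead_state(states, alphabet, delta, dead="dead"):
    """Turn a partial transition function into a total one.

    delta maps (state, symbol) to a single next state and may omit pairs.
    Every omitted pair is sent to a fresh nonaccepting state `dead`, which
    loops to itself on every input symbol.
    """
    if dead in states:
        raise ValueError("the dead state must have a name not already in use")
    total = dict(delta)
    missing = [
        (q, a) for q in states for a in alphabet if (q, a) not in delta
    ]
    if not missing:
        return set(states), total  # already a DFA in the strict sense
    for key in missing:
        total[key] = dead
    for a in alphabet:
        total[(dead, a)] = dead
    return set(states) | {dead}, total


if __name__ == "__main__":
    # A partial automaton over {a, b} whose only path spells "ab".
    partial = {(0, "a"): 1, (1, "b"): 2}
    new_states, total = add_dead_state({0, 1, 2}, "ab", partial)
    print(sorted(map(str, new_states)))
    print(len(total))
```

The set of accepting states is left unchanged, so the completed automaton accepts exactly the same strings as the partial one.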

In the only-if portion of Theorem 2. Supply this proof. Prove this contention. Prove this claim. A common problem in the age of the Web and other on-line text repositories is the following. Given a set of words, find all documents that contain one or all of those words. A search engine is a popular example of this process. Machines with very large amounts of main memory keep the most common of these lists available, allowing many people to search for documents at once.

Inverted-index techniques do not make use of finite automata, but they also take very large amounts of time for crawlers to copy the Web and set up the indexes. There are a number of related applications that are unsuited for inverted indexes, but are good applications for automaton-based techniques. The characteristics that make an application suitable for searches that use automata are: The repository on which the search is conducted is rapidly changing.

For example: For example, a financial analyst might search for certain stock ticker symbols or names of companies. The robot will retrieve current catalog pages from the Web and then search those pages for words that suggest a price for a particular item.

The documents to be searched cannot be cataloged. For example, site. Suppose we are given a set of words, which we shall call the keywords, and we want to find occurrences of any of these words. In applications such as these, a useful way to proceed is to design a nondeterministic finite automaton, which signals, by entering an accepting state, that it has seen one of the keywords.

The text of a document is fed, one character at a time to this NFA, which then recognizes occurrences of the keywords in this text. There is a simple form to an NFA that recognizes a set of keywords.

There is a start state with a transition to itself on every input symbol, e. There is a transition from the start state to q1 on symbol a1 , a transition from q1 to q2 on symbol a2 , and so on. The transition diagram for the NFA designed using the rules above is in Fig. States 2 through 4 have the job of recognizing web, while states 5 through 8 recognize site. We have two major choices for an implementation of this NFA.

Write a program that simulates this NFA by computing the set of states it is in after reading each input symbol. The simulation was suggested in Fig. Then simulate the DFA directly. Some text-processing programs, such as advanced forms of the UNIX grep command egrep and fgrep actually use a mixture of these two approaches.
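The first implementation choice, simulating the keyword NFA by tracking its set of current states, might look like the sketch below. It is only an illustration: the names `build_keyword_nfa` and `find_keywords` are hypothetical, keywords are assumed to be nonempty, and states are numbered from 0 (the start state) rather than from 1 as in the figure.

```python
def build_keyword_nfa(keywords):
    """Build one chain of states per keyword, all branching off start state 0.

    Returns (delta, accepting): delta maps (state, symbol) to a set of next
    states, and accepting maps each final chain state to its keyword. The
    self-loop on the start state is applied during simulation, since it
    holds for every input symbol.
    """
    delta = {}
    accepting = {}
    next_state = 1
    for word in keywords:
        prev = 0
        for ch in word:
            delta.setdefault((prev, ch), set()).add(next_state)
            prev = next_state
            next_state += 1
        accepting[prev] = word
    return delta, accepting


def find_keywords(text, keywords):
    """Feed `text` to the NFA one character at a time and report each
    (end_position, keyword) at which an accepting state is entered."""
    delta, accepting = build_keyword_nfa(keywords)
    current = {0}
    hits = []
    for pos, ch in enumerate(text):
        following = {0}  # the start state loops to itself on every symbol
        for q in current:
            following |= delta.get((q, ch), set())
        current = following
        hits.extend(
            sorted((pos, accepting[q]) for q in current if q in accepting)
        )
    return hits


if __name__ == "__main__":
    print(find_keywords("a website and a web", ["web", "site"]))
```

Because every keyword gets its own complete chain, building the NFA takes time proportional to the total length of the keywords. The per-character cost of the simulation is bounded by the number of NFA states, which is the price paid for not converting to a DFA first.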

However, for our purposes, conversion to a DFA is easy and is guaranteed not to increase the number of states. We can apply the subset construction to any NFA. However, when we apply that construction to an NFA that was designed from a set of keywords, according to the strategy of Section 2. Since in the worst case the number of states exponentiates as we go to the DFA, this observation is good news and explains why the method of designing an NFA for keywords and then constructing a DFA from it is used frequently.

The rules for constructing the set of DFA states are as follows. For example, if two of the keywords begin with the same letter, say a, then the two NFA states that are reached from q0 by an arc labeled a will yield the same set of NFA states and thus get merged in the DFA. Each of the states of the DFA is located in the same position as the state p from which it is derived using rule b above. This state was constructed from state 3. It includes the start state, 1, because every set of the DFA states does.

The transitions for each of the DFA states may be calculated according to the subset construction. However, the rule is simple.

Conversion of the NFA from Fig. On all symbols x such that there are no transitions out of any of the pi 's on symbol x, let this DFA state have a transition on x to that state of the DFA consisting of q0 and all states that are reached from q0 in the NFA following an arc labeled x. For instance, consider state of Fig. Therefore, on symbol b, goes to On symbol e, there are no transitions of the NFA out of 3 or 5, but there is a transition from 1 to 5. Thus, in the DFA, goes to 15 on input e.

Similarly, on input w, goes to On every other symbol x, there are no transitions out of 3 or 5, and state 1 goes only to itself. Like the nondeterminism added in Section 2. In the examples to follow, think of the automaton as accepting those sequences of labels along paths from the start state to an accepting state.

Either this string of digits, or the string 2 can be empty, but at least one of the two strings of digits must be nonempty. Thus, state q1 represents the situation in which we have seen the sign if there is one, and perhaps some digits, but not the decimal point.

State q2 represents the situation where we have just seen the decimal point, and may or may not have seen prior digits. In q4 we have definitely seen at least one digit, but not the decimal point. We introduce context-free grammars, as they are usually called, in Chapter 5. Regular Expressions also denote the structure of data, especially text strings. As we shall see in Chapter 3, the patterns of strings they describe are exactly the same as what can be described by finite automata.

This expression represents patterns in text that could be a city and state, e. Parentheses are used to group components of the expression they do not represent characters of the text described. As we mentioned in the introduction to the chapter, there are two important issues: 1. What can a computer do at all? The subject is studied in Chapter While geometry has its practical side e.

In the USA of the 's it became popular to teach proof as a matter of personal feelings about the statement. While it is good to feel the truth of a statement you need to use, important techniques of proof are no longer mastered in high school.

Yet proof is something that every computer scientist needs to understand. Some computer scientists take the extreme view that a formal proof of the correctness of a program should go hand-in-hand with the writing of the program itself. We doubt that doing so is productive. On the other hand, there are those who say that proof has no place in the discipline of programming. Our position is between these two extremes. Testing programs is surely essential.

However, testing goes only so far, since you cannot try your program on every input. When your testing tells you the code is incorrect, you still need to get it right. To make your iteration or recursion correct, you need to set up an inductive hypothesis, and it is helpful to reason, formally or informally, that the hypothesis is consistent with the iteration or recursion.

This process of understanding the workings of a correct program is essentially the same as the process of proving theorems by induction.

Thus, in addition to giving you models that are useful for certain types of software, it has become traditional for a course on automata theory to cover methodologies of formal proof. Each step in the proof must follow, by some accepted logical principle, from either the given facts, or some of the previous statements in the deductive proof, or a combination of these. The hypothesis may be true or false, typically depending on values of its parameters.

Similarly, we group two of the same operators from the left in arithmetic, so x - y - z is equivalent to (x - y) - z, and not to x - (y - z). These are automata that model the power of real computers. In general, we construct a complete sequence of states for each keyword, as if it were the only word the automaton needed to recognize.

For example: