VKI Studios is now Cardinal Path! www.CardinalPath.com
Learn more about Cardinal Path

Regular Expression to determine if a base 10 number is divisible by 3

Original posted on Erik Vold's Blog

The Problem:

For the regular language L = { w | w mod 3 = 0 }, where the alphabet is {0,1,2,3,4,5,6,7,8,9}; give the deterministic finite automaton (DFA) for L, and convert this to a regular expression.

The Solution:

The DFA ( S, Σ, T, s, A ):
S = {q0,q1,q2}
Σ = {0,1,2,3,4,5,6,7,8,9}
T = (doing the state diagram below)
s = {q0}
A = {q0}

For shorthand I will divide the alphabet, Σ, into:

  • A={0,3,6,9}
  • B={1,4,7}
  • C={2,5,8}

The state diagram:

DFA State Diagram

Now to convert the DFA state diagram into a regular expression. This is done by converting the DFA into generalized non deterministic finite automaton (GNFA), and then converting the GNFA into a regular expression.

Conversion of DFA into GNFA, and removal of q0 state

Notice in the above that I did two steps in one; I first converted the DFA into a GNFA (which is the easy part), then I removed the q0 state.

Removing the q1 state:

Removing the q1 state

Finally, removing the q2 state:

Removing the q2 state

Therefore the regular expression that defines the regular language L is:
(A+)∪((B∪A*B)(A∪CA*B)*CA*)∪((C∪A*C∪(B∪A*B)(A∪CA*B)*(B∪CA*C))(A∪BA*C∪(C∪BA*B)(A∪CA*B)*(B∪CA*C))*(BA*∪(C∪BA*B)(A∪CA*B)*CA*))

For further reading please see "Introduction to the Theory of Computation" by Michael Sipser

Comments (Comment Moderation is enabled. Your comment will not appear until approved.)
I realize that you did not do the above computations, but I feel that I should point out that this regular expression does not function correctly. First of all, to convert a DFA into a regular expression, it is not necessary to convert to a GNFA. This makes the state removal steps easier and the resulting regular expression will be shorter (and still maintain correctness). There must have been some error in the simplification steps due to the introduced complexity.

If we first remove q1, you will come up with a DFA with the following state transitions (since I cannot draw the DFA here):
State Transition EndState
q0 A q0
q0 BA*C q0
q0 C q2
q0 BA*B q2
q2 A q2
q2 CA*B q2
q2 B q0
q2 CA*C q0

Removing state q2 will leave you with the regular expression:
(A|(BA*C)|(((BA*B)|C)(A|(CA*B))*(B|(CA*C))))*

If you want to ensure that the empty string doesn't match, change the '*' at the end to a '+'.

Test the regex that was the proposed solution and you will find that it fails after some time. The above regex should pass for all decimal numbers.

For reference:
A = [0369]
B = [147]
C = [258]
# Posted By Tyler McClung | 8/17/10 2:24 PM
.