![]() ![]() Once you’re happy, don’t forget to save the SDG project, so you can reuse it later. Listing 3 shows the complete reverse regex expression that gives a reasonable attempt at a UK first-line address: The |||| expressions are added to return empty strings of course, because we don’t always want these and the ? qualifier only reduces the chance to 50%. Here, we have the occasional qualifier of North, East, Lower, and so on. (North |East |West |South |Lower ||||)(Manor|Sebastobol|Wellington|High|Vicarage|Rectory|Cowley|Clarendon|New|Old|Milton) (Way|Street|St.|Avenue|Road|Parkway|Close|Drive|Lane|Park|Mews) This will append ‘a’ or ‘b’ to the end of the number 50% of the time, and nothing the other 50%. We’ll fix that in a moment, but first, we ought perhaps to put in something extra for when a house is divided up into flats: However, this means that every address would have a house number, whereas some houses have only names. If we put in a first character between 1 and 9, and maybe a second one and even maybe a third, between 0 and 9 that would give a better range: There is no zero, but if we put one in, there is a chance of a leading zero and this is, you’ll remember, a string. This means “give me a numeric digit between 1 and 9, between none and three times”. We can see if we can do better! Customizing the first-line address using ‘reverse regex’ expressions So that was what SQL Data Generator attempted. In each case, you get an equal chance of the listed alternatives. (Way|Street|St.|Avenue|Road|Parkway|Freeway|Drive|Boulevard|Blvd.).(Nobel|Fabien|Hague|Oak|Second|First|Cowley|Clarendon|New|Old|Milton).The last two expressions are so similar that they can be described together. ![]() Why all the twelve ‘nothings’ (empty strings)? This is to skew the distribution so that there is only a 25 percent chance of having a word returned at all, and this has an equal chance of being North, East, West or South.įollowing this expression is one that is very similar:įour chances in seven, you get nothing, otherwise it is ‘Green’, ‘White’ or ‘Rocky’. This means “return the word North or East or West or South or nothing or nothing or … etc”. If you scan the AddressLine1 column in Figure 2, example, you’ll see that is what we’ve got. So the expressions as a whole, (?), will generate a minimum or 2 digits, a maximum of 3, the first between 1 and 9, the second being between 0 and 9, and maybe a third digit between 0 and 9. For example, (Yes|No) means “use either Yes or No”. You must do this for any regex expression. ? means use a number between 0 and 9 or nothing.This is similar to the LIKE expression in SQL Server. Those square brackets are to denote a collection of characters and the ‘-‘ sign means ‘and all the ASCII characters between the two in the ASCII sequence’. Let’s tackle this reverse regex piece-by-piece, starting with the first expression, which generates the house number: | Avenue | Road | Parkway | Freeway | Drive | Boulevard | Blvd. SQL Data Generator out-of-the-boxįor demonstrating the features of SDG in this article, all we need is a customer address table, which you can recreate using the following code: As an example of what’s possible, in this article I’ll show how to generate realistic UK addresses. With a little tweaking, and a basic grasp of ‘reverse regex’ expressions, a tool like SQL Data Generator (SDG) can do the job very well. How do you generate data that gives you realistic-looking addresses for your part of the world? However, you do need to develop and test the database with data that reflects reasonably accurately the nature of your customer data. You want to run some database tests but you can’t use real production data because it contains personal data, including customer names, addresses, and so on. He is a regular contributor to Simple Talk and SQLServerCentral. Phil Factor (real name withheld to protect the guilty), aka Database Mole, has 30 years of experience with database-intensive applications.ĭespite having once been shouted at by a furious Bill Gates at an exhibition in the early 1980s, he has remained resolutely anonymous throughout his career.
0 Comments
Leave a Reply. |
AuthorWrite something about yourself. No need to be fancy, just an overview. ArchivesCategories |