Black Box Recorders

I’m sure you all know about the black box recorders in planes that record information about how systems on the plane are performing, including the human system better known as the pilot.  When a plane has any type of malfunction, ranging from minor items such as bad indicator lights to crashes, the first thing that the airline wants to recover after dealing with the passengers is the black box recorder.  (Yes, I just got done reading Airframe by Michael Crichton.  An excellent book as always from Michael.)

But did you know you may have a black box recorder in your car?  If you bought a new car this year, you have about a 96% chance of already having a black box installed in your vehicle.  Considering the high percentage of new vehicles that already have this device installed, I have to wonder if the National Highway Traffic Safety Administration (NHTSA) has nothing better to do than to draft a new ruling that would require these black box devices (which they call EDRs or Event Data Recorders) to be installed in all new vehicles by fall of 2014.  Does that extra 4% really justify additional interference by a government agency?  Or is there something else going on?

A lot of people are afraid that these devices could be used to monitor their driving all of the time and that the data could become a condition of obtaining or renewing their auto insurance.  Others may be afraid that the government is requiring these devices as a first step to track where you are and where you have been.  Still others are afraid that it will be linked to some kind of tax that will be levied on you based on the number of miles that you drive your car and perhaps when and where you drive it.  Think of it as an opportunity tax on your ability to drive.

Of course the proponents say that the device is very limited and can only record a few seconds before and during a vehicle crash.  They imply that it is not in a continuous record mode and, even if it were, it could not record more than a dozen seconds or so.  Every dozen seconds or so, it starts at the beginning of the memory range and writes over it with new data.  The proponents also say that the information collected during crashes could help the manufacturers better monitor the deployment of airbags and other safety devices in cars during real crashes.  I guess test car crashes have become too expensive so we are the new crash dummies to provide them with data.  They also try to assure those lobbying against these devices that they do not record conversations or the location of the vehicle.

While some states have already moved to ensure the privacy of this data, making it the property of the car owner and requiring the car owner to consent to it being used, it would seem that these devices are on a slippery slope.  As with all new technology, the benefits must be weighed against the risks.  In Spiderman, there is a quote, ‘With great power comes great responsibility.’  Well, maybe Stan Lee did not originate the quote and maybe it has been misquoted.  FDR wrote, in a speech that he never gave because he died the day before he was supposed to give it, ‘Today we have learned in the agony of war that great power involves great responsibility.’  There is also a quote in the Bible attributed to Jesus that said, ‘To whom much has been given, much will be expected.’

It seems that variations on that warning about power or authority go back thousands of years.  Perhaps the reason why that is so lies in the fact that it is so often violated.  The ‘misuse of power’ or the quote that ‘power has gone to his head’ or ‘power corrupts and absolute power corrupts absolutely’ also rings through the ages.  I suppose you could even say that the Emperor in the Star Wars saga is an example of absolute power corrupting absolutely.

So what does that have to do with a black box in your car?  Perhaps nothing.  Perhaps everything.  Like the laws in this country that are changed subtly over time by individual case judgments that slowly change the original intent of the law, you could see a slow change over time to the recording capabilities of these black boxes.  Each change will be proposed as absolutely sensible and for a good cause.  However, taken as a whole, it could change our relationships with our automobiles.  Will these devices in the future take over control of the vehicle if they sense a pending crash?  Will they be able to sense when the driver is not paying attention to the road because they are falling asleep or are too busy texting and begin reducing the speed of the vehicle?  Will they act to save lives or will they simply tattle on us to some central data collection system kept in the deep vaults under Area 51 where the aliens used to be kept?  Or are the aliens the ones wanting to collect the data in the first place?  Hmmm…  Wait, there may be an idea for a science fiction book here.   Got to go.

C’ya later.

Honest Things That Can Get You Into Trouble Today

The other day I came across an article that reported on a supposed FBI and Dept of Justice flyer sent to businesses in 2012 to help spot “suspicious activity” so that it could be reported as potential terrorist activity.  Some of the ridiculous things mentioned I really have to question such as:

People who are overly concerned about privacy and attempt to shield the screen from the view of others, especially when entering their PIN numbers for their debit cards.  Honestly, I’d just as soon not use a debit card if I have to let people watch my fingers key in the PIN number.  Since when has privacy become bad?

But wait, if I want to pay cash, another item is to be aware of people who always pay cash or use credit card(s) in different names.  Now I don’t know how many businesses will remember that you came back to the store and used a credit card in different names.  However, what constitutes a different name?  On some of my credit cards I have my middle initial and on others I don’t.  Some have my name as Michael and others have it as Mike.  And when I don’t use a credit card I typically use cash.  You know, that thing called legal tender in the old United States of America that I grew up in and believed in.  OH NO!  They are probably on to me now.

Observed using multiple cell phones.  Guilty!  Since my wife died in March of this year, I carry both her cell phone and mine because people are still calling on her number and I need to get those calls and messages.  I guess that makes me a suspicious character also.

Being someplace “you don’t belong.”  I guess I really want a better definition of this one because it seems so arbitrary.

Having a missing hand or fingers.  Thankfully this is not my problem.  However my deceased father was missing a finger, lost during WWII fighting for the freedoms of this country and other freedom-loving countries.  He was a prisoner of war and barely made it out alive.  But he was missing a finger so I guess if he was alive today, he would be a suspect.

Wearing a backpack when the weather is warm.  Really officer, it is just my computer from work that I carry back and forth from work to home and back again.  It is not the master control to some evil device invented by Khan Noonien Singh (Ok, that is a reference to Star Trek so don’t freak out on me.  I just saw the new movie over the weekend.).  Good thing our schools are going to digital curriculum so students don’t have to carry books back and forth between school and home in a backpack.  What’s that you say, they haven’t been carrying books home for years anyway?

Conducting financial transactions in bursts of activities within a short period of time, especially in previously dormant accounts.  WOW AM I IN TROUBLE NOW.  With having to handle the estates for both my father-in-law and my wife over the last few months I had to transfer accounts from banks to banks opening and closing accounts, some of which had no activity for years.

Of course this entire blog probably now falls under the ‘Making suspicious comments or anti-U.S. sentiments that appear to be out-of-place and provocative.’  I assure you that they are not out-of-place but very much in-place especially if common sense is being cast aside by blindly following rules.  In fact, I am probably more pro-U.S.  but with a greater degree of common sense and with my eyes wide open to what is happening to our freedoms.  A few months ago I wrote a blog entry called something like, ‘WWJD?’, which actually stood for, ‘What Would Jefferson Do?’  I wonder what he would say to these new guidelines.

Oh, a few final things that could get you in trouble:  Don’t communicate through a PC game or by using VOIP (does SKYPE count?  What about instant messaging?).  Don’t travel illogical distances to use Internet Cafes even if the local ones are crowded or cost more or have antiquated hardware or software or they have better food or cheaper coffee at the other location.  Don’t display an interest in remote-controlled aircraft.  (You know I have wondered about those people in malls selling remote controlled helicopters that they fly around inside the mall that nearly hit the people walking through the mall.)  And finally, don’t buy coffee with cash on a regular basis. HUH?  Coffee?  What if it is in the Internet Café?

C’ya next time.

Flashing Light Traffic Traps

I have one.  You have one.  Everyone has one.  Well, perhaps if you are from a very small town or live miles from any town, city, or metropolis.  Yes, I’m talking about that traffic light that turns green only long enough for one car to get through the intersection before turning red again.  I call these the flashing light traffic traps.  They are totally unpredictable, totally unreliable, and totally dangerous.

Mine is just down the street at the exit from the local high school.  Sometimes the light stays green long after all the cars have passed through the intersection and no additional cars are in sight while at the same time keeping people on the cross street stuck waiting for the light to change.  Other times, the light changes from green back to red before two cars already at the intersection can make it across the intersection.  As the third car in line slams on its brakes, I have to wonder how long it will be until someone not expecting such a ridiculously short light slams into the back of a car stopping at the quick change light.

So I have to ask, is this short light a mistake?  Is it a programming error in the sensors and timers that control the light?  Is it a way for the local police force to trap unsuspecting drivers who go through the red light?  Is it a coincidence that this light is right outside a high school with all of its young and less experienced drivers?  I have to say that this light has behaved like this for the last couple of years so I find it hard to believe that it is an accident.  But it is really the accidents that I’m most worried about, because I believe it is only a matter of time until someone hits a car that stopped as the light changed back to red, or until someone drives through the red because they were not paying attention to the light and did not expect it to change so quickly.

I’m very aware of these issues because thirteen years ago I was rear-ended at a different traffic light that changed to red after only one car went through.  I stopped, but the truck behind me did not.  When the police came, the driver of the truck readily admitted that they had not expected the light to change after only a single car made it through the intersection.  However, the police officer had no sympathy and said that it does not matter how long the light stays green.

My underlying question then has to be this, ‘Is there any way for the public to report such issues to the local government agency responsible for maintaining the lights?’  Will it matter?  In some ways, this is no different than putting traffic cameras on corners and then shortening the length of the yellow light to make more drivers actually go through the red light because there is no way to stop before the intersection when the light first changes from green to yellow.  Now not all intersections with traffic cameras play this game, but enough shave off a second or two from the yellow light time just to augment the local municipality’s traffic ticket fine collections.  Again, the real problem here is not about the desire to control traffic, but the creation of a potentially unsafe traffic condition in which people have to decide between slamming on their brakes or risking a ticket by continuing through the intersection.

I suppose there is something to be said for driving defensively.  However, don’t take it to extremes either like the car I was following the other day that would slow down to about 25 mph at every intersection just on the outside chance that the light might turn red.  After all, in the last week alone I’ve seen drivers cut across three lanes of traffic to turn left from the far right hand lane.  I’ve seen motorcyclists weave down the expressway between cars moving at only about 30 mph in the evening rush hour traffic.  They even rode on the line between lanes to pass between cars.  Then there was the driver parked at the intersection with a map out trying to figure out where they were.  And don’t get me started about the drivers going down the expressway at 40 mph while everyone else is cruising at 60-70 mph because they are so focused on the cell phone conversation, they are not even aware that they are still driving.

Anyway, watch out for those flashing traffic lights, be safe, and keep reading here.  C’ya next time.

 

Finding Out When Enough Is Enough

Last week I asked the question, ‘How many people must you ask in a survey?’  While I talked about the topic in generalities at that time, I also mentioned that it would be interesting to test the hypothesis that a survey only needs to query a small percentage of the population to get meaningful results.

To test that theory, I took some data from a recent survey that was conducted over several months and disguised the question, but kept the results.  The question was a basic Likert Scale type question in which the question itself postulates a specific position and asks the survey taker whether they agree or disagree with the statement.  This survey was conducted using the SharePoint Survey list and was set to allow a user to answer the question only once so as not to stuff the ballot box, so to speak.

The total possible population of respondents was around 12,000.  Of course the survey owners wanted to get as many respondents as possible, which is why they conducted the survey over several months.  However, I have always been of the opinion that for this survey, keeping the survey open for more than about a month really did not seriously affect the overall results.  By that I mean that the responses after about a month were a true representation of the total population and that there was no need to try to get 100% participation.

The question I chose to use for this study contained five possible responses listed below:

  • Strongly Agree
  • Agree
  • Neither Agree nor Disagree
  • Disagree
  • Strongly Disagree

Since I was collecting the data with SharePoint, I also stored the date on which each survey was taken.  Therefore, I could tell, for any given date, how many responses had been entered since the start of the survey.  Knowing the total population, I could very easily determine the percent participation.  By exporting the data from SharePoint to an Excel spreadsheet, an extremely valuable option from a SharePoint survey, I could load the data into a PowerPivot data model and then create a variety of tables and charts based on the data.

The first figure I show below is the final tabulated results after three and a half months of data collection.  You can see that the total count of responses was only 8,203 out of a possible 12,000.  This represents a little more than 67% of the population.  Of the people who responded to the question (Yes, the question was changed to protect the guilty), ‘I believe Pivot Tables help me analyze data at work,’ 63.7% of them strongly agreed with the statement.  In fact, over 96% agreed or strongly agreed with the statement.  But my question was, did I need to poll 67% of the population to discover that?

Survey01

Going back to my PowerPivot table, I added a report filter (For those that don’t have PowerPivot, this data set is small enough that a simple Excel Pivot table would also work fine.).

Survey02

When I open the filter dropdown as shown in the next figure, I can expand the All node of the value tree to show all the possible values in the table.  Note each date is represented as a separate entry.

Survey03

In order to select multiple dates as my filter, I need to click the checkbox at the bottom of the list box: Select Multiple Items.  This action places a checkbox next to each date as well as the All node.  By default, all records (dates in my case) are selected.

Survey04

I first need to unselect the checkbox next to the All node.  Then I can select only the dates that I want to appear in my table.  For example, in the next figure, I select only the first three days of the survey.

Survey05

When I click OK, my table updates and shows a total count of 214 survey responses, of which 76.64% strongly agreed with the statement.  While this is in the neighborhood of the final 63.7% at the end of the survey period, it is still about 13 percentage points away.  Obviously 3 days of a survey are not enough.

Survey06

I then chose 10 more days through February 2nd.

Survey07

This time with 1103 responses, my result for strongly agree was 65.55% and my total for strongly agree and agree was 96.7%.  Now I am getting really close to my final results, and after only 13 days rather than three and a half months.

Survey08

I added another 10 days, bringing my survey count up to 4023, nearly half of the three and a half month result, and my Strongly Agree percent is starting to settle in at 63.81%, only a tenth of a percent off of the final result.

Survey09

So, just for fun, (statistics is fun isn’t it?) I decided to chart the percentage of Strongly Agree responses as a function of the survey date.  I noticed that by the time I hit a month into the survey, my results had flattened out to around 64% plus or minus less than a half percent.

Survey10
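For readers who would rather script this than click through a pivot table filter, here is a minimal sketch in Python of the same running calculation.  The day numbers and answers in it are made-up sample rows, not the real survey data, and the function name running_percent is my own invention for illustration.

```python
from collections import Counter

# Hypothetical sample rows: (day number of the survey, answer given).
# These are NOT the real survey responses, just enough rows to run the code.
responses = [
    (1, "Strongly Agree"), (1, "Agree"),
    (2, "Strongly Agree"), (2, "Strongly Agree"),
    (3, "Disagree"), (3, "Strongly Agree"),
    (4, "Agree"), (4, "Strongly Agree"),
]

def running_percent(rows, answer="Strongly Agree"):
    """Return [(day, cumulative responses, cumulative % choosing `answer`)]."""
    per_day_total = Counter(day for day, _ in rows)
    per_day_hits = Counter(day for day, given in rows if given == answer)
    total = hits = 0
    result = []
    for day in sorted(per_day_total):
        total += per_day_total[day]
        hits += per_day_hits[day]
        result.append((day, total, 100.0 * hits / total))
    return result

for day, total, pct in running_percent(responses):
    print(f"day {day}: n={total}, strongly agree={pct:.2f}%")
```

Each row of the output corresponds to selecting all the dates up to and including that day in the report filter, which is what I did by hand above.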

I then plotted the percent response rate, assuming a maximum of 12,000 possible responders, and found that the results had flattened out at only about a 15-17% response rate.

Survey11

So after surveying only about 15% of the population, I could say that the additional survey results over the next two and a half months would not significantly affect my results.  Therefore, I could also say that it would be reasonable to assume that even though I only surveyed 67% of the total population, getting responses from the remaining 33% would probably not significantly change my results.

That is the power of surveys.  The trick is determining when the survey results begin to flatten out.  Every survey can be a little different and the number of possible answers to the survey will also affect the result (something we can maybe test in a future blog entry).

If I were plotting this data on a daily basis, I would have been able to see when my results began to flatten and be able to ‘declare a winner’ with a great degree of certainty after a month and a half or perhaps less.  In fact, with greater experience with similar types of data and by using questions with fewer possible answers, the size of the survey can be greatly reduced while retaining a high level of accuracy in the result.
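One way to spot the flattening without eyeballing a chart is sketched below in Python.  This is only my rough illustration, not a formal statistical test: the function name first_stable_index, the half-point tolerance, the five-value window, and the sample percentages are all assumptions I picked for the example.

```python
def first_stable_index(percents, tolerance=0.5, window=5):
    """Return the index of the first value that starts a run of `window`
    consecutive values whose spread (max - min) is at most `tolerance`
    percentage points, or None if no such run exists."""
    for i in range(len(percents) - window + 1):
        chunk = percents[i:i + window]
        if max(chunk) - min(chunk) <= tolerance:
            return i
    return None

# Hypothetical running percentages of Strongly Agree, one per reporting date.
series = [76.6, 71.2, 68.0, 65.6, 64.4, 64.1, 63.9, 63.8, 64.0, 63.7]

print(first_stable_index(series))
```

A tighter tolerance or a longer window makes the check more conservative, so it would ‘declare a winner’ later.  Note that it only looks at one window at a time and cannot promise that the results will not drift again afterward.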

I hope you found this interesting.  I chose to give the Tuesday blog a bit more of a technical twist this week because I am about to go on a summer writing schedule.  What does that mean?  I may drop back to one blog entry a week for most weeks.  There are just so many other things to do in the summer that are more fun than writing a blog, like cutting the grass and pulling weeds from the garden or even trimming overgrown bushes.  Anyway, I’ll try to keep a few non-technical blogs in the mix each month to lighten up the reading from the dry technical topics.  When fall comes, I will switch back to two entries a week.

C’ya later.

It’s Good To Be Regular!

It’s good to be a regular expression that is.  (What did you think I meant? 😉 )  Anyway, last week we talked about using patterns when defining domain rules.  While pattern matching can solve some problems, it cannot solve all problems.  For example, suppose you want a 4-character string that must begin with a letter between A-F.  Pattern matching may help you look for letters in a string, but it cannot limit which characters are acceptable.  (An ‘extreme’ case you may not have thought about since last week is that a character can be a numeric digit or symbol, but a number in a pattern cannot be a non-numeric character.)  Another good example is when I want the user to enter a hex code for a color.  Hex codes range from 00 to FF.

The way to define a domain rule for these situations is not by using a pattern.  Rather, a regular expression lets you control the specific characters allowed at every position in a string and can be extremely flexible.  First, let’s look at some of the rules as they might apply to a specific example.

Let’s go back to that first case where I want a 4-character string that begins with a letter between A-F.  I can begin the regular expression with the string: [A-F].  This would allow the string to begin with any character from ‘A’ through ‘F’.  However, the strict definition of regular expressions would tell me that I really need to use [A-Fa-f] so that the user could enter either upper case or lower case letters.  While that is true for applications development using regular expressions to validate input, DQS treats the comparisons as case insensitive and so you can use either [A-F], [a-f], or [A-Fa-f].
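If you want to try this outside of DQS, here is a small sketch using Python’s re module.  Keep in mind that Python is not DQS; I am assuming that its re.IGNORECASE flag is a fair stand-in for the case-insensitive comparison DQS performs, and the variable names are just mine.

```python
import re

# [A-F] matches exactly one character in the range A through F.
strict = re.compile(r"[A-F]")
# re.IGNORECASE imitates the case-insensitive behavior described for DQS.
relaxed = re.compile(r"[A-F]", re.IGNORECASE)
# Spelling out both ranges gives the same effect without the flag.
explicit = re.compile(r"[A-Fa-f]")

for text in ["C", "c", "G"]:
    print(text,
          bool(strict.match(text)),
          bool(relaxed.match(text)),
          bool(explicit.match(text)))
```

In a case-sensitive engine only the second and third patterns accept the lower case ‘c’, which is why [A-Fa-f] is the safer habit when you write regular expressions for application code.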

Note that the text within the brackets represents just one character position even though several characters may appear.  If I wanted to validate against a non-sequential set of characters such as in [ABEFMPST], that would be a valid way to ensure that the character of the domain value is one of these eight letters.

If I want to allow most letters in the alphabet with the exception of only a few, I could specify the characters not allowed in the character position using an expression like [^IOQ].  This expression would allow any character except the letters ‘I’, ‘O’, or ‘Q’.  By itself, this would also allow numeric digits so I may want to use [^IOQ0-9] instead.
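Here is a quick Python sketch of the negated character class, again using Python’s re module rather than DQS itself, so the case sensitivity may differ from what DQS does.  The variable names are only for illustration.

```python
import re

not_ioq = re.compile(r"[^IOQ]")                # anything except I, O, or Q
not_ioq_or_digit = re.compile(r"[^IOQ0-9]")    # also rejects the digits 0-9

for text in ["A", "O", "7", "#"]:
    print(text, bool(not_ioq.match(text)), bool(not_ioq_or_digit.match(text)))
```

Notice that a symbol such as ‘#’ still passes both versions.  A negated class only lists what is forbidden, so everything I did not think to exclude is allowed.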

Everything I talked about so far only applies to the first character.  In my example, I want the remaining three characters to be numbers.  I could change my regular expression to: [A-F][0-9][0-9][0-9].  However, because the second through fourth characters are defined the same way, I can use the following expression to indicate that I want to use the same character definition for the next three characters:  [A-F][0-9]{3}.  The number 3 in the curly brackets indicates that the previous character expression should be repeated for three characters.
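To convince yourself that the long form and the short form behave the same, you could run a sketch like this one in Python (my choice of language for illustration, not something DQS requires).  The sample values are made up.

```python
import re

long_form = re.compile(r"[A-F][0-9][0-9][0-9]")
short_form = re.compile(r"[A-F][0-9]{3}")

samples = ["D104", "A12", "G123", "B007"]
for value in samples:
    # match() tests the pattern starting at the beginning of the value.
    print(value, bool(long_form.match(value)), bool(short_form.match(value)))
```

Both patterns should agree on every sample, accepting the values that start with a letter from A-F followed by three digits and rejecting the others.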

Interestingly enough, this regular expression would also match four consecutive characters anywhere in a larger string, as long as they begin with a letter from A-F followed by 3 digits.  In fact, it would declare the following value to be valid:  45 Main St, Ste D104.  You see, by itself, the regular expression is available to match characters anywhere within a string.  If I want to force the expression to match string values that begin with a specific sequence, I must start the expression with the caret character as in: ^[A-F][0-9]{3}.  With this string as the regular expression, the above address would not be considered a valid match.
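The address example is easy to reproduce with a short Python sketch.  I am using re.search here because it looks for a match anywhere in the string, which is the behavior described above; whether DQS does exactly the same thing under the covers is my assumption.

```python
import re

address = "45 Main St, Ste D104"

# Without the caret, the pattern may match anywhere inside the string.
unanchored = re.search(r"[A-F][0-9]{3}", address)
# With the caret, the match must start at the first character.
anchored = re.search(r"^[A-F][0-9]{3}", address)

print(unanchored.group() if unanchored else None)  # the text that matched
print(anchored)                                    # None means no match
```

The unanchored pattern finds the suite number buried at the end of the address, while the anchored one rejects the address because it starts with ‘4’.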

What if I don’t know how many times a character definition needs to be repeated?  I could use any of the following:

^[A-F][0-9]*    This allows for zero or more numbers after the letter

^[A-F][0-9]+    This allows for one or more numbers after the letter

^[A-F][0-9]?    This allows for zero or one numbers after the letter

^[A-F][0-9]{3,}   This allows for at least 3 numbers after the letter.

^[A-F][0-9]{3,6}   This allows for at least 3 but no more than 6 numbers after the letter.
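To see what each of these repetition forms actually grabs, here is a small Python sketch.  The helper matched_text and the sample values are hypothetical, and Python’s re module stands in for DQS.

```python
import re

# The five repetition forms from the list above, keyed by their quantifier.
patterns = {
    "*": r"^[A-F][0-9]*",
    "+": r"^[A-F][0-9]+",
    "?": r"^[A-F][0-9]?",
    "{3,}": r"^[A-F][0-9]{3,}",
    "{3,6}": r"^[A-F][0-9]{3,6}",
}

def matched_text(pattern, value):
    """Return the part of `value` the pattern matched, or None if no match."""
    found = re.search(pattern, value)
    return found.group() if found else None

for value in ["B", "B12", "B12345678"]:
    for name, pattern in patterns.items():
        print(f"{value!r:12} {name:6} -> {matched_text(pattern, value)!r}")
```

Pay attention to the cases where the matched text is shorter than the value.  The pattern still counts as a match even though extra digits are left over.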

So I might think that I could use ^[A-F][0-9]{3,3} to ensure that valid values begin with a letter followed by three and only three numbers.  Unfortunately, it does not work like this, because the expression only requires the value to start with that sequence and says nothing about what may follow it.  Rather, there is another character that I can add to the end of an expression that basically says that the string must end with the defined expression.  That character is the dollar sign.  Therefore, I could use ^[A-F][0-9]{3}$ to ensure that the string only has four characters and that those characters begin with a letter A-F followed by three digits.
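Here is a short Python sketch comparing the two expressions side by side.  As before, Python’s re module is standing in for DQS, and the sample values are made up.

```python
import re

prefix_only = re.compile(r"^[A-F][0-9]{3,3}")   # no end anchor
whole_value = re.compile(r"^[A-F][0-9]{3}$")    # anchored at both ends

for value in ["D104", "D1045", "D10", "XD104"]:
    print(value,
          bool(prefix_only.search(value)),
          bool(whole_value.search(value)))
```

The value D1045 is the interesting one: the first pattern accepts it because the first four characters fit, while the second pattern rejects it because of the extra digit.  One Python-specific wrinkle is that the dollar sign will also match just before a trailing newline, so in Python code re.fullmatch is the stricter choice.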

Let me say that what I have covered here is just the tip of the iceberg when it comes to regular expression capabilities.  There is much more that I can do with expressions.  However, my emphasis is to cover BI related topics such as PowerPivot, SSAS and DQS, not to go off on a multi-week tangent about regular expressions.  Therefore, I’m going to give you a few references to let you explore the richness of regular expressions on your own.

As I said, this has been just a brief introduction to the world of regular expressions.  There are many sites that will teach you how to build regular expressions.  A good place to start might be http://www.regular-expressions.info/tutorial.html.

In closing, you may want to download a free tool that will help you discover how regular expressions work.  The tool is named EditPad Pro 7 and can be downloaded from http://www.editpadpro.com/download.html.

C’ya next time when I take a look at matching in DQS projects.

Mike