Functions for cleaning imported text

 

The following functions were used to clean text imported from an Excel Workbooks into Access. In spite of the fact that Excel has a notional limit on the amount of text a user can put in a cell, when users develop their own ‘database’ in Excel there is no limit to the ingenuity to avoid or work round the limits that the software tries to impose on them. Although these were written specifically for one application their application in any situation where it is necessary to clean a text string will be fairly obvious and the changes needed to implement in Access rather than in Excel are straight forward.

 

I cannot claim credit for writing these but the person who wrote them knows who he is and has given me permission to publish them. We worked together for about 8 months and by the end of that time we were writing code as if it had been written by the same person. The only way we could tell who had written what was by virtue of how we had initially split the system in two, but as time progressed we had each amended each others routines.  

 

The RemoveHardReturns Function removes the hard carriage returns from a cell string.

 

Function RemoveHardReturns(strCurrCell as String) As String

     On Error GoTo Err_RemoveHardReturns

     Dim intLC as Integer

     Dim strCurrChar as String

     For intLC = 1 To Len(strCurrCell)

          strCurrCell = Mid(strCurrCell, intLC, 1)

          If strCurrChar = Chr(10) or strCurrChr = Chr(13) Then

              strCurrCell = Left(strCurrCell, intLC -1) & Right(strCurrCell, Len(strCurrCell) – intLC)

          End If

     Next intLC

     RemoveHardReturns = strCurrCell

 

Exit_RemoveHardReturns:

     Exit Function

 

Err_RemoveHardReturns:

     MsgBox Err.Description

     Resume Exit_RemoveHardReturns

 

End Function

 

The RemoveDoubleSpaces will find Double Spaces in a string and remove them.

 

Function RemoveDoubleSpaces(strCurCell as String) As String

     On Error GoTo Err_RemoveDoubleSpaces

     Dim intLC as Integer

     Dim strCurrChar as String

     Dim strPrevChar as String

     strCurrChar = “”

     strPrevChar = “”

     For intLC = 1 To Len(strCurrCell)

          strCurrChar = Mid(strCurrCell, intLC, 1)

          If strCurrChar = “ “ And strPrevChar = “ “ Then

              strCurrCell = Left(strCurrCell, intLC – 1) & Right(strCurrCell, Len(strCurrCell) – intLC)

              intLC = intLC – 1

          End If

          strPrevChar = strCurrChar

     Next intLC

     RemoveDoubleSpaces = strCurrCell

 

Exit_RemoveDoubleSpaces:

     Exit Function

 

Err_RemoveDoubleSpaces:

     MsgBox Err.Description

     Resume Exit_RemoveDoubleSpaces

 

End Function

 

The InsertSpaceAfterPeriod Function is designed to make the text in a cell look like a sentence. This may cause a problem where a user has inserted text with legal numbering in to refer to clauses of an agreement so should be treated with due caution.

 

Function InsertSpaceAfterPeriod(strCurrCell as String) as String

     Dim intLC as Integer

     Dim strCurrChar as String

     Dim strPrevChar as String

     strCurrChar = “”

     strPrevChar = “”

     For intLC = 1 To Len(strCurrCell)

          strCurrChar = Mid(strCurrCell, intLC, 1)

          If strCurrChar <> “ “ And Not IsNumeric(strCurrChar) And strPrevChar = “.” Then

              strCurrCell = Left(strCurrCell, intLC – 1) & “ “ & Right(strCurrCell, Len(strCurrCell) – intLC + 1)

          End If

          strPrevChar = strCurrChar

     Next intLC

     InsertSpaceAfterPeriod = strCurrCell

 

Exit_ InsertSpaceAfterPeriod:

     Exit Function

 

Err_ InsertSpaceAfterPeriod:

     MsgBox Err.Description

     Resume Exit_ MsgBox Err.Description

 

End Function

 

If you then write a Procedure that loops through a range of cells applying as many of the above Functions as your application requires you should end up with reasonably clean text to import into your database of choice. In our case it was Access and we encountered a number of other issues to do with numbers masquerading as text and dates doing all sorts of strange impersonations.