The Problem With Embeds, Part 2: Extracting And Testing Code


In the first article in this series I said that most Clarion developers use embed points the wrong way, and by doing so they make their applications more difficult to maintain, test, debug and document. Almost every Clarion developer has done that; I've done it too. In these articles I intend to show how you can improve your code base by taking the majority of that code out of the embed points.

I'll be working through a couple of examples, including the Invoice app which ships with Clarion. But in this article I'll focus on the TXA embed parser example I introduced in Part 1.

In any analysis of embed code and how it might be better deployed, the first thing you need is a convenient way to look at just your embeds. That's not so easy because, embeds being embeds, they're sprinkled throughout your application.

Really you have two options: you can browse your embeds in the embed list or in the embeditor, or you can use a tool to extract just the embeds so you can look at them without the distraction of the generated code.

About TXAs

As far as I know, there's only one way to programmatically extract the embed code from an APP. First, you have to export a TXA, which is an import/export text version of the information contained in an APP file. Second, you have to parse the TXA to get the embeds, which can be a bit of a pain as the TXA format isn't documented.

A couple of years ago I wrote a TXA parser to do just that task, in support of an article series on the most popular embed points. So it seemed natural, when I needed a way to extract embed points for this series, to revisit that code.

Unfortunately, that code isn't very pretty. Most of it is contained in just a couple of embed points. In the TakeAccepted method's data section there are some declarations:

x              long
vars           group,pre()
procname         string(200)
procFromABC      string(60)
procCategory     string(60)
embedname        string(60)
embedPriority    long
embedParam1      string(200)
embedParam2      string(200)
embedParam3      string(200)
whenLevel        byte
               end
dumptrace      byte(0)
LastProcName  like(procname)
lastEmbedName string(500)
currembedname string(500)

And then a little later on in TakeAccepted, at an embed that's called when the user presses the Import button, the TXA gets parsed. That code loops through the records in a previously created queue of TXA files (txaq):

      ?progressvar{prop:rangehigh} = records(txaq)
      setcursor(cursor:wait)
      loop x = 1 to records(txaq)
         get(txaq,x)
         ?progressVar{prop:progress} = x-1
         clear(ema:record)
         EMA:TXA = txaq.name
         Access:EmbedApp.Insert()
         EmbedApp{prop:sql} = 'select last_insert_id()'
         next(EmbedApp)
         ! Add the queue header record
         access:TextFile.Close()
          GLO:TextFileName = txaq.name
         access:TextFile.Open()
         Access:TextFile.UseFile()
         set(TextFile)
         ProcName = ''
         state = 0
         lineNo = 0
         clear(procname)
         clear(lastprocname)
         clear(lastembedname)
         clear(currembedname)
         LOOP
            next(TextFile)
            if errorcode() then break.
            dumptrace = false
            lineNo += 1
            CASE state
            OF 0 ! search for the start of a module or procedure, or an embed
               if sub(txt:rec,1,11) = '[PROCEDURE]'
                  clear(vars)
                  state = 10
               elsif sub(txt:rec,1,8) = '[MODULE]'
                  clear(vars)
                  procName = '[MODULE]'
               elsif sub(txt:rec,1,7) = 'EMBED %'
                  embedName = sub(txt:rec,7,len(txt:rec))
                  state = 30
               elsif sub(txt:rec,1,8) = '[SOURCE]'
                  state = 50
               end
            OF 10 ! get procedure name details
               if sub(txt:rec,1,4) = 'NAME'
                 procName = sub(txt:rec,6,len(txt:rec))
                 state = 11
               end
               do CheckForMissedEmbed
            OF 11
               if sub(txt:rec,1,8) = 'FROM ABC'
                 procFromABC = sub(txt:rec,10,len(txt:rec))
                 state = 12
               end
               do CheckForMissedEmbed
            OF 12
               if sub(txt:rec,1,8) = 'CATEGORY'
                 procCategory = sub(txt:rec,11,len(clip(txt:rec))-11)
               end
               state = 0
               do CheckForMissedEmbed
            of 30 ! Look for a first embed parameter 
               if sub(txt:rec,1,11) = '[INSTANCES]'
                  state = 41
               elsif sub(txt:rec,1,8) = '[SOURCE]'
                  state = 50
               end
            of 41 ! Get first parameter
               if sub(txt:rec,1,6) = 'WHEN '''
                  embedParam1 = sub(txt:rec,7,len(clip(txt:rec))-7)
                  WhenLevel = 1
                  !db.out('whenlevel=' & whenlevel)
               end
               state = 42
               do CheckForMissedEmbed
            of 42 ! Look for a second embed parameter
               if sub(txt:rec,1,11) = '[INSTANCES]'
                  state = 43
               elsif sub(txt:rec,1,8) = '[SOURCE]'
                  state = 50
               end
            of 43 ! Get second parameter
               if sub(txt:rec,1,6) = 'WHEN '''
                  embedParam2 = sub(txt:rec,7,len(clip(txt:rec))-7)
                  WhenLevel = 2
               end
               state = 44
               do CheckForMissedEmbed
            of 44 ! Look for a third embed parameter
               if sub(txt:rec,1,11) = '[INSTANCES]'
                  state = 45
               elsif sub(txt:rec,1,8) = '[SOURCE]'
                  state = 50
                  !db.out('found PRIORITY')
               end
            of 45 ! Get third parameter
               if sub(txt:rec,1,6) = 'WHEN '''
                  embedParam3 = sub(txt:rec,7,len(clip(txt:rec))-7)
                  WhenLevel = 3
                  !db.out('whenlevel=' & whenlevel)
               end
               state = 50
               do CheckForMissedEmbed
            of 50  ! look for the priority
               if sub(txt:rec,1,8) = 'PRIORITY'
                  embedPriority = sub(txt:rec,10,len(txt:rec))
                  if lastprocname <> procname
                     ! insert new EmbedProc record
                     clear(EMP:record)
                     EMP:Proc = procname
                     EMP:ProcFromABC = ProcFromABC
                     EMP:ProcCategory = ProcCategory
                     EMP:EmbedAppID = EMA:EmbedAppID
                     Access:EmbedProc.Insert()
                     EmbedProc{prop:sql} = 'select last_insert_id()'
                     next(EmbedProc)
                     lastprocname = procname
                  end
                  ! Add the embed record
                  currEmbedName = clip(embedName) & clip(embedparam1) |
                     & clip(embedparam2) & clip(embedparam3) & embedpriority
                  if currEmbedName <> lastEmbedName
                     lastEmbedName = currEmbedName
                     EMB:EmbedProcID = EMP:EmbedProcID
                     EMB:Embed = EmbedName
                     EMB:Param1 = EmbedParam1
                     EMB:Param2 = EmbedParam2
                     EMB:Param3 = EmbedParam3
                     EMB:Priority = embedpriority
                     access:Embed.Insert()
                  end
                  state = 51
                end
            of 51
               state = 60
               do CheckForMissedEmbed
            OF 60  ! capturing embed
               ! Quit when [END] encountered
               if sub(txt:rec,1,1) = '['
                  if sub(txt:rec,1,5) = '[END]'
                     case WhenLevel
                     of 3
                        WhenLevel = 2
                        embedParam3 = ''
                     of 2
                        WhenLevel = 1
                        embedParam2 = ''
                     of 1
                        WhenLevel = 0
                        embedParam1 = ''
                     end
                     state = 0

                  elsif sub(txt:rec,1,8) = '[SOURCE]'
                     ! look for another embed under this [EMBED] point
                    state = 50
                  else
                     ! could be we're done
                     state = 0
                  end
              elsif sub(txt:rec,1,6) = 'WHEN '''
                  case WhenLevel
                  of 0
                     ! get the first param
                     embedParam1 = sub(txt:rec,7,len(clip(txt:rec))-7)
                     WhenLevel = 1
                     state = 42
                  of 1
                     ! get the second param
                     embedParam2 = sub(txt:rec,7,len(clip(txt:rec))-7)
                     WhenLevel = 2
                     state = 50
                  of 2
                     state = 50
                  end
               else
                  ! write embed buffer 
               end
            else
               do CheckForMissedEmbed
            END
         END
         Access:TextFile.Close()
      end
      ?progressVar{prop:progress} = records(txaq)
      setcursor()

For the complete source, have a look at the ImportTXAs procedure in Embeds.app in the downloadable zip.
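Much of the bulk in that listing comes from the repeated sub(txt:rec,1,n) = '...' comparisons. As a rough illustration of the kind of small, dependency-free unit that can be pulled out of an embed, here is a sketch of a prefix-matching helper. StartsWith is a hypothetical name and is not part of the downloadable source; the sample line is made up to match the tokens the parser looks for.

```clarion
  PROGRAM

  MAP
StartsWith    PROCEDURE(STRING txt, STRING prefix),BYTE
  END

rec     STRING(200)

  CODE
  rec = 'EMBED %ControlHandling'          ! sample line, for illustration only
  IF StartsWith(rec, '[PROCEDURE]')
    MESSAGE('Found a procedure header')
  ELSIF StartsWith(rec, 'EMBED %')
    ! Same expression the original parser uses: everything from column 7 on
    MESSAGE('Found embed: ' & CLIP(SUB(rec, 7, LEN(rec))))
  END

! Hypothetical helper: returns TRUE when txt begins with prefix
StartsWith    PROCEDURE(STRING txt, STRING prefix)
  CODE
  IF SUB(txt, 1, LEN(prefix)) = prefix
    RETURN TRUE
  END
  RETURN FALSE
```

Because the helper touches no files, windows or globals, it can be called from any app, and it can be tested without running the import at all.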

What I had in mind for my new utility was something more along these lines:

My original parser's functionality was overkill for this new app; I really didn't need to build up an elaborate database of applications, procedures, and embed points. I just needed a list of embed points. But obviously I needed all of the parsing capabilities, gnarly though the code might be.

Unfortunately, there wasn't any way to use my original code unchanged, primarily because that code didn't actually extract the embedded source from the TXA - there was no need to capture the actual embed code since I was just logging embed usage. And I was motivated not to store the embed code: I had asked for TXA submissions which could contain sensitive information, and I didn't want to accidentally expose anyone's embed code to public view.

So what were my options? A few came to mind:

Just cut and paste the source. This is what a lot of us do, and it has the obvious drawbacks of creating multiple versions of the code to maintain. If I find a bug in my parsing code, I'll need to hunt down every place I've pasted a copy and make the change there.
Put my original procedure in a DLL and call it as needed. In this case my DLL also presents a user interface. That would result in some UI clunkiness in the app I envisioned in Figure 1, which doesn't need to call yet another window just to do the parsing.
Put the common source in source files and INCLUDE them. I could even include just portions of the files using labels. The drawback here is that there's no way to know how the various sections of source code might be used, or how any bug fixes to that source might cause unexpected bugs. Using INCLUDE statements this way results in an almost complete loss of control over the source code.
Create a template containing the source code. This certainly helps keep the code in one place, but it has a lot of negatives when it comes to maintaining and testing that code since you have to put the template in an app and generate the code, and then you have to port any changes back to the template.

And there were other problems. Because my original app was tied to a particular data store (a PostgreSQL database), any re-use of that code would have to know the table definitions. Since Clarion only supports one dictionary per app, any apps that used this procedure would either need to use the dictionary or import the tables from that dictionary.

Class or procedure?

So what's the answer? If it's not a template, and not a multi-purpose generated procedure and not INCLUDEd source, what's left? Procedures and classes, that's what. But not just any procedures or classes. I wanted to write code that had as few dependencies as possible.

Some of the dependencies I wanted to avoid:

Files/tables - not tie my code to a specific database
Windows/controls - not tie my code to a specific user interface element
Other code - keep calls to other procedures/classes to a minimum

So when should you use a class, and when a procedure?

In almost all cases a class is preferable to a procedure, in the same way that a procedure is almost always preferable to spaghetti code. A procedure presents a single point of entry and a single result. That's not to say you can't return multiple values from a procedure - you clearly can, as Steve Parker has shown. But procedures don't have the flexibility of classes.

In fact, a class method is really just a procedure, so using a class already gets you everything a procedure can do. But it also gets you more, because in a class multiple methods can operate on shared data.

I'm not going to go into all the details of how to create a class; I'll cover that in following articles. For now just keep your eye on the transformation of the embedded code into a class, and don't worry excessively (yet) about exactly how it was done.

The class

The first decision I have to make is how to store the embed data. Originally I used a SQL database, but this now appears to be a liability. My parser should use something more transient, which can be converted to a permanent store or just discarded after use. An in-memory data store, in the form of nested queues, fits the bill:

TxaEmbedLineQueue   queue,TYPE
line                    cstring(1000)
                    END
TxaEmbedQueue       queue,type
embedname               string(100)
EmbedLineQ             &TxaEmbedLineQueue                        
                    END
TxaProcedureQueue   queue,TYPE
ProcName                string(40)
EmbedQ                  &TxaEmbedQueue
                    END

The parser's job will be to populate these queues so that I end up with a TxaProcedureQueue containing one or more records. Each of those procedure records has one or more TxaEmbedQueue records, each of which has one or more TxaEmbedLineQueue records for each line in a given embed point. With that queue structure in hand I can easily update a database or create a text file, as I see fit.
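The following sketch shows one way nested queues like these might be filled, walked and released. It is not taken from the parser class; the procedure name, embed name and source line are sample values, and the variable names are assumptions. The point worth noting is that each child queue is created with NEW, so each one has to be freed and disposed by hand before its parent record goes away.

```clarion
  PROGRAM

  MAP
  END

! Queue types as declared in the article
TxaEmbedLineQueue   QUEUE,TYPE
line                  CSTRING(1000)
                    END
TxaEmbedQueue       QUEUE,TYPE
embedname             STRING(100)
EmbedLineQ            &TxaEmbedLineQueue
                    END
TxaProcedureQueue   QUEUE,TYPE
ProcName              STRING(40)
EmbedQ                &TxaEmbedQueue
                    END

procQ       &TxaProcedureQueue
p           LONG
e           LONG
lineCount   LONG(0)

  CODE
  procQ &= NEW TxaProcedureQueue

  ! Add one procedure record with its own (empty) embed queue
  CLEAR(procQ)
  procQ.ProcName = 'Main'
  procQ.EmbedQ &= NEW TxaEmbedQueue
  ADD(procQ)

  ! Add one embed to that procedure, with its own line queue
  CLEAR(procQ.EmbedQ)
  procQ.EmbedQ.embedname = '%ControlHandling'
  procQ.EmbedQ.EmbedLineQ &= NEW TxaEmbedLineQueue
  ADD(procQ.EmbedQ)

  ! Add the embed's source lines
  procQ.EmbedQ.EmbedLineQ.line = 'do SetOutputFilename'
  ADD(procQ.EmbedQ.EmbedLineQ)

  ! Walk the structure, count the lines, then release the child queues
  LOOP p = 1 TO RECORDS(procQ)
    GET(procQ, p)
    LOOP e = 1 TO RECORDS(procQ.EmbedQ)
      GET(procQ.EmbedQ, e)
      lineCount += RECORDS(procQ.EmbedQ.EmbedLineQ)
      FREE(procQ.EmbedQ.EmbedLineQ)
      DISPOSE(procQ.EmbedQ.EmbedLineQ)
    END
    FREE(procQ.EmbedQ)
    DISPOSE(procQ.EmbedQ)
  END
  FREE(procQ)
  DISPOSE(procQ)
  MESSAGE('Embed lines found: ' & lineCount)
```

In a class-based design this cleanup would naturally live in a method such as Reset or a destructor, so callers never have to remember it.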

At first blush this looks like an ideal task for a procedure. Pass in the name of the TXA and an empty queue and get back a filled queue: presto! But here's why I think the procedural solution is hardly ever a good solution: there's almost always some new functionality you can add which doesn't fit into the existing procedure.

Think about error handling. You can have the parsing procedure return an errorcode if the parse fails for any reason, but what if you want to get a more detailed error response? What if you want to enable tracing or logging? How would you do that in a single procedure call, without burdening the procedure with some obscenely large number of parameters?
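To illustrate the error-handling point, here is a minimal sketch of a class that returns a simple pass/fail result while keeping the detailed message available on request. DemoParser is a stand-in invented for this example, not the real parser class, and the two failure conditions are assumptions chosen only to show the pattern.

```clarion
  PROGRAM

  MAP
  END

! Hypothetical stand-in class, declared and implemented in one source file
DemoParser    CLASS,TYPE
ErrorMessage    STRING(500),PRIVATE
Parse           PROCEDURE(STRING filename),BYTE
GetLastError    PROCEDURE(),STRING
              END

parser        DemoParser

  CODE
  IF NOT parser.Parse('')
    MESSAGE('Parse failed: ' & parser.GetLastError())
  END

! Returns TRUE on success; on failure stores the reason for later retrieval
DemoParser.Parse    PROCEDURE(STRING filename)
  CODE
  IF CLIP(filename) = ''
    SELF.ErrorMessage = 'No TXA file name was supplied'
    RETURN FALSE
  END
  IF NOT EXISTS(filename)
    SELF.ErrorMessage = 'File not found: ' & CLIP(filename)
    RETURN FALSE
  END
  SELF.ErrorMessage = ''
  RETURN TRUE

! Returns the message stored by the most recent failure
DemoParser.GetLastError   PROCEDURE()
  CODE
  RETURN CLIP(SELF.ErrorMessage)
```

The caller's signature stays the same no matter how many failure cases are added later, which is exactly what a single procedure with a fixed parameter list struggles to offer.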

In fact, once I began rewriting my parsing code as a class I ended up with rather a lot of methods and a few properties as well:

TxaParserClass      CLASS,type,module('txaparserclass.clw'),link('txaparserclass.clw',|
                          _ABCLinkMode_),DLL(_ABCDllMode_)
ErrorMessage            string(500),PRIVATE
ExternalQueue           byte(false)
dbg                     &Debuger,PRIVATE
ProcQ                   &TxaProcedureQueue
AddNewEmbed             procedure(string embed,string embedparam1,|
                                  string embedparam2,string embedparam3,|
                                  string embedpriority),private
AddNewProcedure         procedure(string pname),private
CheckForMissedEmbed     procedure(string s,long lineno,long state),private
Construct               PROCEDURE
RemoveCurrentProcedureFromQueue procedure,private
GetLastError            procedure,string
Parse                   PROCEDURE(string s),BYTE
Reset                   PROCEDURE
SetDebuger              procedure(Debuger dbg)
SetQueue                procedure(TxaProcedureQueue procq)
Trace                   PROCEDURE(string s),private
Write                   procedure(string filename),byte,proc
                    end

I won't go into all of these methods in detail - you can have a look at the downloadable source if you're interested. Briefly, there are a few private methods to break up the internal code into reusable blocks (you'd use routines in a procedural implementation), and there are some public methods to get the last error, parse the specified TXA, reset the parser, assign a debugging object, specify the particular queue to use, and write the contents of the queue out to a text file.

Now, show me how you'd implement all that in a procedure!

I now have a Write method that dumps my embeds to a text file, but I could also create another Write method that would dump the embeds out to a database. I'd have to do this by passing in file and field references, and perhaps FileManager references, but it could be done, and it would ensure the code remained portable between not just apps but also dictionaries.

Automated testing!

Perhaps more interesting to me than simple reuse is that by moving my code into a class I've made it testable. But what does testing really mean, in the Clarion world? Usually it means writing some code in an embed point, running the app, clicking some clicks and watching what happens. That's useful, but it doesn't necessarily tell you what you need to know about the reliability of your core code. And it's tedious.

You can take away some of the tedium with automated testing tools, except that historically Clarion hasn't played well with those tools because Clarion apps use custom Windows controls.

But you still won't be testing your app's core functionality in a repeatable way.

Testing is a complex subject, and there are lots of different ways to test application code, but certainly one of the most useful is unit testing. The core idea is that you reduce your code to the smallest testable units, and then you run tests on a regular basis. Those tests must be automated; they have to run without user intervention (other than to kick off a series of tests, although you might want a test suite to run automatically on a regular basis, or as part of a build).
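The core idea can be sketched without any tooling at all. The program below is a hand-rolled illustration, not CTest: GetEmbedName, Tally and the two test procedures are hypothetical names invented for this example. Each test is a procedure that returns true or false, and the program simply runs them all and reports the failures.

```clarion
  PROGRAM

  MAP
GetEmbedName                          PROCEDURE(STRING txt),STRING
Test_GetEmbedName_Embed_ReturnsName   PROCEDURE(),BYTE
Test_GetEmbedName_Other_ReturnsBlank  PROCEDURE(),BYTE
Tally                                 PROCEDURE(STRING testName, BYTE passed)
  END

failures      LONG(0)
failedList    STRING(500)

  CODE
  Tally('Test_GetEmbedName_Embed_ReturnsName', Test_GetEmbedName_Embed_ReturnsName())
  Tally('Test_GetEmbedName_Other_ReturnsBlank', Test_GetEmbedName_Other_ReturnsBlank())
  IF failures = 0
    MESSAGE('All tests passed')
  ELSE
    MESSAGE(failures & ' test(s) failed:' & CLIP(failedList))
  END

! The unit under test: returns the embed name from an 'EMBED %...' line
GetEmbedName  PROCEDURE(STRING txt)
  CODE
  IF SUB(txt, 1, 7) = 'EMBED %'
    RETURN CLIP(SUB(txt, 7, LEN(txt) - 6))
  END
  RETURN ''

Test_GetEmbedName_Embed_ReturnsName   PROCEDURE()
  CODE
  IF GetEmbedName('EMBED %DataSection') = '%DataSection'
    RETURN TRUE
  END
  RETURN FALSE

Test_GetEmbedName_Other_ReturnsBlank  PROCEDURE()
  CODE
  IF GetEmbedName('[SOURCE]') = ''
    RETURN TRUE
  END
  RETURN FALSE

! Records a failed test by name
Tally         PROCEDURE(STRING testName, BYTE passed)
  CODE
  IF NOT passed
    failures += 1
    failedList = CLIP(failedList) & ' ' & testName
  END
```

A dedicated test runner removes the need to list each test by hand, but the contract is the same: small units, one procedure per expectation, no user interaction.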

Unit testing is commonplace in the .NET world, in part because certain features of the CLR (in particular, reflection) make it fairly simple for testing tools to examine assemblies (DLLs), locate classes and methods marked as test code, run those tests and report on the results.

Automated testing is a bit more awkward in a language like Clarion, but still possible, as Figure 2 shows.

Figure 2. Testing the parser class

Testing the parser

Figure 2 is a demonstration of two applications, neither of which I've yet discussed. The application that you see running is called (tentatively) CTest, and is a Clarion test runner. CTest's job is to load up a specified DLL, search it for test procedures, run those test procedures and report on the results. It's very loosely patterned on the kinds of unit testing applications readily available to .NET developers. I'll have an article describing the inner workings of CTest next month.

The second app involved in Figure 2 is a DLL APP called TxaParserTest. This APP contains some test procedures created using a special version of the Source procedure template.

Here's the code for a procedure called Test_ParseTXA_OpenFile_ReturnTrue. First there are some data declarations:

parser                                  TxaParserClass
q                                       TxaProcedureQueue

txa                                     string(500)

And then there's some code:

    parser.SetQueue(q)

    txa = '..\Invoice\invoice.txa'
    if parser.parse(txa)
        tr.Passed = true
    ELSE
        tr.message = parser.GetLastError()
    END

There's some additional code which is automatically generated by the procedure template; I've only shown the embed code I added to create the test. For instance, the tr object is an instance of TestResultT, which is a utility class used to report results back to the test runner (CTest), and tr.Passed defaults to false. Again, I'll get into the details in a future article.

The entire purpose of this test is simply to verify that the parser's Parse method executes successfully, indicating the specified TXA was found and something was done with it.

The Test_ParseTXA_GetProcedures_VerifyCount test verifies that the parser found the correct number of procedures containing embeds. The data is similar to the previous test, so here's just the code:

    parser.SetQueue(q)
    txa = '..\Invoice\invoice.txa'
    if parser.parse(txa)
        if records(q) = 13
            tr.Passed = true
        ELSE
            tr.message = 'Expected 13 procedures but found ' |
              & records(q)
        END
    ELSE
        tr.message = parser.GetLastError()
    END

Test_ParseTXA_GetEmbeds_VerifyCount is similar but requires some additional data:

expectedEmbedCount                      long,dim(20)
x                                       long

And the code:

    parser.SetQueue(q)
    txa = '..\Invoice\invoice.txa'
    if parser.parse(txa)
        expectedEmbedCount[1] = 1
        expectedEmbedCount[2] = 1
        expectedEmbedCount[3] = 2
        expectedEmbedCount[4] = 4
        expectedEmbedCount[5] = 1
        expectedEmbedCount[6] = 3
        expectedEmbedCount[7] = 4
        expectedEmbedCount[8] = 3
        expectedEmbedCount[9] = 2
        expectedEmbedCount[10] = 1 
        expectedEmbedCount[11] = 8 
        expectedEmbedCount[12] = 4 
        expectedEmbedCount[13] = 2         
        tr.passed = true 
        loop x = 1 to records(q)
            get(q,x)
            if expectedEmbedCount[x] <> records(q.EmbedQ)
                tr.Message = 'Test failed on index ' & x |
                  & ', procname ' & clip(q.ProcName) & ', count ' |
                  & records(q.EmbedQ)
                tr.passed = false
                break
            END
        END        
    ELSE
        tr.message = parser.GetLastError()
    END

Here's another test, this time called Test_ParseTXA_WriteEmbedLog_VerifyExistence. Here's the code:

    txa = '..\Invoice\invoice.txa'
    parser.SetQueue(q)
    if parser.parse(txa)
        if parser.Write('ThisIsATestFileAndCanBeDeleted.txt')
            if ~exists('ThisIsATestFileAndCanBeDeleted.txt')
                tr.message = 'Output file was not created'
            ELSE
                tr.passed = true
            END
        else
            tr.message = parser.GetLastError()
        END
        
    ELSE
        tr.message = parser.GetLastError()
    END

This test verifies that the parser was able to write a specific test file.

These tests are by no means exhaustive, but they do illustrate the usefulness of extracting code from embed points. Not only can I reuse that code in other applications, I can test the code and gain confidence that it's doing exactly what it should be doing. This is especially important when it comes time to modify the code, either because a bug was found (in which case there should be a new test that confirms the bug fix) or because some new feature is needed.
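As an example of the kind of test that might be added later, here is a sketch of a failure-path test written in the same style as the ones above. It is a fragment that would live inside a template-generated test procedure, so it relies on the same tr object and data declarations as the earlier tests. The test name and the TXA file name are hypothetical, and it assumes that Parse returns false and sets an error message when the file does not exist, which the article does not confirm.

```clarion
! Suggested name: Test_ParseTXA_MissingFile_ReturnFalse (hypothetical)
! Data, as in the earlier tests:
!   parser   TxaParserClass
!   q        TxaProcedureQueue
!   txa      string(500)

    parser.SetQueue(q)
    txa = 'ThisTxaShouldNotExist.txa'    ! assumed not to exist on disk
    if parser.parse(txa)
        tr.message = 'Parse unexpectedly succeeded for a missing file'
    elsif clip(parser.GetLastError()) = ''
        tr.message = 'Parse failed but no error message was set'
    ELSE
        tr.Passed = true
    END
```

Because tr.Passed defaults to false, the test only passes when both assumptions hold: the parse is rejected and a reason is reported.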

Any time you change code you run the risk of introducing new bugs; having a comprehensive test suite greatly reduces the likelihood of those new bugs going unnoticed. It's also a great way to verify that software upgrades (whether Clarion, or third party products, or even operating systems) haven't broken your code.

Obviously, this kind of code extraction works best with code that isn't tied to the UI and isn't dependent on a particular database, although in most cases, and with TPS files in particular, you can still run automated tests more easily against a database than you can against a user interface. But that's also my point: the code that really gives you application value is almost always code that doesn't depend on the UI or on a particular physical database. It may depend on a certain data structure, but that structure seldom has to be just a TPS file or only a SQL table or whatever specific implementation you currently use.

Embed code reduction in the embed utility

So what's the end result of refactoring my parser into a class? After exporting the TXA from my ListEmbeds.app utility, I ran that utility against the TXA and got this output:

PROCEDURE: Main

  EMBED: %ControlHandling ?TxaName 4000

        do SetOutputFilename

  EMBED: %ControlHandling 4000

              do SetOutputFilename

  EMBED: %ProcedureRoutines 500

    SetOutputFilename   ROUTINE
        if clip(txaname) = ''
            outputfile = ''
        else
            OutputFile = clip(TxaName) & '.embeds.txt'
        END
        display(?outputfile)

  EMBED: %DataSection 1300

    parser              &TxaParserClass
    q                   TxaProcedureQueue

  EMBED: %ControlEventHandling ?go Accepted 2500

            parser &= new TxaParserClass()
            setcursor(cursor:wait)
            parser.SetQueue(q)
            parser.Reset()
            if parser.parse(TxaName)
                if not parser.Write(OutputFile)
                    message('Unable to write output file: ' & parser.getlasterror())
                END
            ELSE
                message('Unable to parse txa: ' & parser.getlasterror())
            END
            do Viewer4:Initialize
            display()
            setcursor()
            dispose(parser)

There's still a bit of embed code there, but now none of it contains any significant business logic. All the critical bits are inside my parser class.

There is a template that does this already, you know

Even back when I wrote the original parser I was pretty sure there was a template that would extract embed points, and Steve Parker finally pointed it out to me: Bo Schmitz' free BoTpl utility can extract embed points from applications. Bo's excellent template does this by exporting a TXA and parsing the result. Sound familiar?

As useful as Bo's template is (and I highly recommend you get it and use it) I think it also points out the value of putting business logic into classes rather than into templates. Source code can be easily tested, and even debugged with the rudimentary Clarion debugger; templates are much more difficult to test and there is no true template debugger, only logging tools.

No, I'm not done yet

As has often happened to me, in the course of writing my tests I discovered some new features I wanted to implement, and also a bug or two that my initial set of tests hadn't uncovered. I'll explore these issues another time; at some point I'll also detour into an explanation of the CTest test runner app, and then it'll be on to the semi-real-world example of the Invoice app.

The downloadable source is a C7 solution containing four APPs and a source code project. You can also load up any of these projects individually.

CTest - the test runner application
Embeds - the original embed application from the article series (doesn't extract embed source)
ListEmbeds - the new app to list embed source
TxaParser - a hand coded project containing the TXA parser class and supporting code
TxaParserTest - an app containing unit test procedures for the TXA parser

Figure 3. The solution

As well there are libsrc and template directories containing supporting templates and classes you may want to copy to your libsrc and template directories.

Read Part 3.

Download the source with DLLs (15 megs)

Download the source only (1 meg)