Re: Got UART Working!!! need syntax help with using ascii/buffer scheduling.

Started by Jonathan Bromley January 27, 2009
On Mon, 26 Jan 2009 17:26:25 -0800 (PST), jleslie48 wrote:

>Yeah I'm gonna fall into that linear C programming trap a lot.
[...]
>never liked threads ;)
Time to change, if you're trying to use VHDL and hardware.
>the SelectCntr process as you pseudo coded up, will use its clk
>to make the looper go through the TstData, but the routine is
>going to need a governor of some sorts so that message only goes
>out once, yes? I mean that either I have to control the clk to it
>or make another semaphore that turns on on reset and off on
>TxCntr = 15.
OK, let me try to appeal to the programmer in you.
Bear with me through the long-winded waffle. I'll
do some hands-on practical stuff at the end.

Your conventional programming paradigm is almost purely
procedural (OOP changes that, but not by very much).
Procedural code can be built up - composed - into
bigger blocks of procedural code using well-known
schemes: loops, function calls. But whatever you
do in C, you end up with something *procedural*.

In VHDL and any HDL, procedural code is not the only
act in town. You can do all the things you do in C,
BUT YOU MUST DO THEM WITHIN THE CONFINES OF A PROCESS.
So we have procedural loops, we have functions and
procedures, we have conditional statements, all just
like C (modulo the obvious syntax differences). You
compose these things into a bigger lump of procedural
code, and then - and this is where the big mental
shift must occur - you wrap it up as a PROCESS. As
a side-effect of that, it also gets wrapped in an
implied infinite loop and optionally gets a sensitivity
list.

From this point on, you have completely lost the power
to do procedural composition. The process is, irrevocably,
a concurrent block that must be composed structurally with
all the other processes in your design or simulation model.
Threads, whether you like them or not :-)

As a matter of convenience and design organization, you
then wrap a process, or a group of processes, into an
entity/architecture (or a module, in Verilog). To use
that entity you must instantiate it within another
architecture; the instance so created is, in every
meaningful sense, just a rather big process.

So, let's say it just one more time: Inside a process,
you write procedural code using all the sequential
composition techniques you know and love. Processes,
though, club together by parallel or structural
composition - there is NO way to get one process or
entity to "call" another. (I'm talking VHDL here.
The story is a little different in Verilog, but
even there I'm not wildly wrong.)

(Historical note for old grumps like me: Thirty
years ago, occam and CSP had this sorted out, with
arbitrary sequential and parallel composition of
processes at all levels of the hierarchy. But
people who said "never liked threads" killed it
dead. One day the wheel will turn, but not until
the C programmer hegemony is smashed.)

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

OK, so now let's go back to the problem faced by
every programmer who tries to get to grips with
HDL design: How do I get one module to call (that
is, to make use of functionality in) another?

First, let's think a little about how these
process descriptions get mapped to hardware.
You've already seen, and successfully used, the
clocked process template. I'm sure that by now
you understand how this works: on each clock
edge, your process looks at the values of all
signals and variables, and uses those values
to determine the values they will have immediately
after the clock edge (the "next-state" value).
The whole of the process executes in zero time
at the moment of the clock edge, and does NOT stop
to wait for other things to happen. Consequently
any state information that needs holding from one
clock cycle to the next (such as "which character
of the string am I sending at the moment") must
have explicit storage. If you follow this approach,
you have a piece of code that can be synthesized.

So: you have now built your UART transmitter, and
you would like to get it to send a message. As a
programmer you desperately want to call a function
in the UART that says "send a character". But you
can't do that, because the UART is a process (OK,
several processes composed into an entity). So you
must use the ports on the entity to control it.
By far the safest, clearest-headed way to think about
this is to organize your blocks as producers or
consumers of data. The UART transmitter is a
consumer.
It has a "ready" output of some sort,
which means "on this clock edge it's OK for you
to post some new data to my Tx buffer". And it has
a "data valid" input of some sort, which the producer
uses to say "on this clock edge, my data is valid
and I want you to take it". Given this two-wire
clocked handshake, we can now separate out the
producer. Remember that it's a clocked process,
so it must execute the whole of its process body
on each and every clock. Traditionally this is
coded as a state machine (I guess you use them in
software too). Here's how I see the state machine
shaping up:

  if rising_edge (clock) then

    -- Some outputs are asserted only in one
    -- state. It's easiest to establish their
    -- default or idle value here:
    UART_Tx_Data_Valid <= '0';
    Ready_For_Start <= '0';
    -- We don't bother to default the UART Tx data,
    -- because it's ignored if UART_Tx_Data_Valid is false.

    -- Everything is governed by the state value -
    -- where have we got to in the sequence?
    case (current_state) is

      when waiting_for_start =>
        -- Tell the external world that it is OK to
        -- ask us to start sending.
        Ready_For_Start <= '1';
        -- Hold until we see an external trigger.
        if Start_Message_Trigger = '1' then
          current_state <= send_a_char;
          msg_pointer <= 1;
        end if;  -- otherwise remain waiting_for_start

      when send_a_char =>
        -- First, check if we've fallen off the end of
        -- the message - ASSUMES message is null-terminated.
        if my_msg(msg_pointer) = NUL then
          current_state <= waiting_for_start;
        -- Otherwise, hold until the UART is ready.
        elsif UART_Tx_Ready = '1' then
          -- Prepare data for UART.
          UART_Tx_Data <= to_slv(my_msg(msg_pointer));
          -- Drive UART data-valid strobe.
          UART_Tx_Data_Valid <= '1';
          -- Increment ready for the next character.
          msg_pointer <= msg_pointer + 1;
        end if;

      when others =>
        -- oops, current_state somehow got a bad value;
        -- maybe take evasive action such as resetting
        -- various things.

    end case;

  end if;  -- clock

Now, there are lots of hints here and this code is
(I think) basically sound, but I've left loads of
details for you to fill in - most particularly, the
data declarations. But I hope it's given you a clue
about how you can write completely independent consumers
and producers of data, locked together by a simple
clocked handshaking protocol using "ready" and "valid"
signals (there are, of course, many possible ways to
do the details, but that one's easy to understand).

Note, too, that my "producer" here is in fact also
a "consumer" in its own right; it has a start/done
handshake with the next higher level producer, which
I guess will take responsibility for filling-in the
message data. After all, you presumably want at least
some control over when and what message gets sent.
Start_Message_Trigger could easily be an external
push-button for the purpose of your tests. If
Start_Message_Trigger is false, the state machine
will stick in its waiting_for_start state.

If you want to make a constant message for initial
testing, you can easily null-terminate it thus:

  constant my_msg: string := "This is it" & NUL;

Sometimes it may be better to use a string length
count rather than null-termination; in that case,
your end-of-message test would instead be

  if msg_pointer = msg_length then ...

But hey - you're a programmer - you can easily
see all that sort of stuff. It's the block-to-block
communication and handshaking that has you confused,
right? Let us know how you get on.
-- 
Jonathan Bromley, Consultant

DOULOS - Developing Design Know-how
VHDL * Verilog * SystemC * e * Perl * Tcl/Tk * Project Services

Doulos Ltd., 22 Market Place, Ringwood, BH24 1AW, UK
jonathan.bromley@MYCOMPANY.com
http://www.MYCOMPANY.com

The contents of this message may contain personal views which
are not the views of Doulos Ltd., unless specifically stated.
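
For readers who want to compile the state machine above, here is one possible way to fill in the data declarations the post leaves open. It is only a sketch under stated assumptions: the entity name msg_sender, the port list, the state_type enumeration, the body of to_slv, and the use of signal initial values instead of a reset are all guesses, not part of the original post.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical wrapper entity; the port names follow the post,
-- the entity name and port directions are assumptions.
entity msg_sender is
  port (
    clock                 : in  std_logic;
    Start_Message_Trigger : in  std_logic;
    Ready_For_Start       : out std_logic;
    UART_Tx_Ready         : in  std_logic;
    UART_Tx_Data          : out std_logic_vector(7 downto 0);
    UART_Tx_Data_Valid    : out std_logic
  );
end entity msg_sender;

architecture rtl of msg_sender is

  -- Null-terminated test message, as suggested in the post.
  constant my_msg : string := "This is it" & NUL;

  -- Assumed state encoding: just the two states the post names.
  type state_type is (waiting_for_start, send_a_char);
  signal current_state : state_type := waiting_for_start;

  -- VHDL strings are indexed from 1; the pointer never needs to go
  -- past the terminating NUL, which is the last element.
  signal msg_pointer : integer range 1 to my_msg'length := 1;

  -- Assumed helper: convert a character to its 8-bit code.
  function to_slv (c : character) return std_logic_vector is
  begin
    return std_logic_vector(to_unsigned(character'pos(c), 8));
  end function to_slv;

begin

  sender : process (clock)
  begin
    if rising_edge(clock) then
      -- Default (idle) values for the single-cycle outputs.
      UART_Tx_Data_Valid <= '0';
      Ready_For_Start    <= '0';

      case current_state is

        when waiting_for_start =>
          Ready_For_Start <= '1';
          if Start_Message_Trigger = '1' then
            current_state <= send_a_char;
            msg_pointer   <= 1;
          end if;

        when send_a_char =>
          if my_msg(msg_pointer) = NUL then
            -- Reached the terminator: message is finished.
            current_state <= waiting_for_start;
          elsif UART_Tx_Ready = '1' then
            UART_Tx_Data       <= to_slv(my_msg(msg_pointer));
            UART_Tx_Data_Valid <= '1';
            msg_pointer        <= msg_pointer + 1;
          end if;

      end case;
    end if;
  end process sender;

end architecture rtl;
```

Because the enumeration here has only two values, the "when others" branch is dropped; keep it if your state type has spare encodings. Note also that UART_Tx_Data_Valid is a registered output, so the UART sees it one clock after this process sampled UART_Tx_Ready. Whether that is acceptable depends on the exact ready/valid contract of the UART, which the post deliberately leaves open.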
On Jan 27, 4:17 am, Jonathan Bromley <jonathan.brom...@MYCOMPANY.com>
wrote:
> [...]
Rick, Jonathan,

First off, Jonathan, it's far from long-winded waffle. You are
absolutely recognizing my dilemma/misconceptions. What you are
saying is absolutely correct; I've got to get it into my way of
thinking, and I will keep working at it.

I see EE guys tremble in fear at a double nested bubble sort
algorithm, something I take for granted, and here I am with
the roles reversed on something as silly as putting out the "hello
world" message on a screen...

Rick, I don't believe it's the 16550 UART. I've been through so many
looking for something I could use as a template, I'm bleary-eyed.
This is the one I got working:

 http://grace.evergreen.edu/dtoi/arch06w/asm/KCPSM3/

It's the:

UART Transmitter and Receiver Macros

8-bit, no parity, 1 stop bit
Integral 16-byte FIFO buffers

by Ken Chapman
   Xilinx Ltd
   January 2003

http://grace.evergreen.edu/dtoi/arch06w/asm/KCPSM3/Docs/UART_Manual.pdf

I put my hardware support engineer onto this UART version while I
continued here with another UART model, and lo and behold, he was
able to actually get a Process "Generate Programming File" to
complete successfully for the first time in months using this
template.

Hence the reason I'm not totally familiar with the "bucket brigade"
FIFO, but if it's what I think it is, I believe my character buffer
for the message is already in place.

Also, let me make sure I've got some definitions right:

a tx shift register   - the storage location that is actually worked
                        on by the TX mechanism, shifting one bit at a
                        time onto the TX wire, changing the voltage
                        high and low at the proper time in accordance
                        with the baud rate.

a tx holding register - this is the queue for characters to be sent
                        out. On completion of the tx shift register
                        sending out the current character, the lead
                        value in this register is copied to the
                        shift register for transmission.

a character source    - characters are copied to the holding register
                        in order, using an index into this "array" of
                        8-bit ASCII characters.
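
The three definitions above can be made concrete with a small transmitter sketch. This is not Ken Chapman's macro (which uses a 16-byte FIFO); it is a minimal illustration with a one-byte holding register feeding a 10-bit shift register. The entity name uart_tx_sketch, all port names, and the assumption that baud_tick is a one-clock-wide pulse once per bit time are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical minimal transmitter: 8 data bits, no parity, 1 stop bit.
entity uart_tx_sketch is
  port (
    clock      : in  std_logic;
    baud_tick  : in  std_logic;  -- assumed: one-clock pulse per bit time
    data_in    : in  std_logic_vector(7 downto 0);  -- from the character source
    data_valid : in  std_logic;  -- character source says "take this byte"
    ready      : out std_logic;  -- '1' when the holding register is free
    tx         : out std_logic   -- serial line, idles high
  );
end entity uart_tx_sketch;

architecture rtl of uart_tx_sketch is
  -- tx holding register: one byte waiting its turn.
  signal holding      : std_logic_vector(7 downto 0) := (others => '0');
  signal holding_full : std_logic := '0';
  -- tx shift register: stop bit & 8 data bits & start bit, bit 0 first.
  signal shifter      : std_logic_vector(9 downto 0) := (others => '1');
  -- Number of shifts still needed to finish the current frame.
  signal shifts_left  : integer range 0 to 9 := 0;
begin

  ready <= not holding_full;
  tx    <= shifter(0);

  transmit : process (clock)
  begin
    if rising_edge(clock) then

      -- Character source -> holding register.
      if data_valid = '1' and holding_full = '0' then
        holding      <= data_in;
        holding_full <= '1';
      end if;

      -- Everything on the serial side moves once per bit time.
      if baud_tick = '1' then
        if shifts_left /= 0 then
          -- Present the next bit; refill with idle-level '1's.
          shifter     <= '1' & shifter(9 downto 1);
          shifts_left <= shifts_left - 1;
        elsif holding_full = '1' then
          -- Holding register -> shift register; the start bit
          -- appears on tx immediately after this clock edge.
          shifter      <= '1' & holding & '0';
          shifts_left  <= 9;
          holding_full <= '0';
        end if;
      end if;

    end if;
  end process transmit;

end architecture rtl;
```

In this sketch the "character source" is whatever block drives data_in and data_valid, for example a state machine indexing into a constant string. Replacing the single holding register with a FIFO changes only how ready is generated and where the next byte comes from; the shift register part stays the same.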

I'm gonna take a few minutes to digest both of your two comments,
clarify what exactly I have already, I'll be back shortly.

Sincerely,

Jon
On Tue, 27 Jan 2009 08:26:44 -0800, Mike Treseler wrote:

>The reference design here:
>http://mysite.verizon.net/miketreseler/
>is a single process
>(aka: sequential, linear, single threaded)
>vhdl design that just happens to be a uart.
Mike, you know I'm a strong supporter of the design
style you advocate, although I'm less persuaded
than you are by the benefits of parameterless
procedures. However, that doesn't affect the
key point I made in an earlier post:
once you have encapsulated some functionality
in an HDL process, your ability to do any form
of sequential/procedural composition is gone.

This, I believe, is a fundamental problem with
HDLs that will not go away until something with
the expressive power of CSP/occam surfaces in
the HDL world. Maybe that's already happened
with MyHDL and it's escaped me. Maybe not.

Of course, everyone who knows what they're doing
has perfectly good ways to deal with this problem.
It's not a show-stopper. But it presents a
fundamental barrier to the development of
hardware description beyond a crude block-level
process-by-process approach.

Yours more in hope than in expectation,
-- 
Jonathan Bromley
Jonathan Bromley wrote:

> you know I'm a strong supporter of the design
> style you advocate, although I'm less persuaded
> than you are by the benefits of parameterless
> procedures.
I use procedures to fix the process template and otherwise to avoid cut and paste errors on repeated blocks of code.
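
For readers unfamiliar with the idea, here is a minimal sketch of what "parameterless procedures fixing a process template" can look like. It is an illustration only, not code from the reference design mentioned in this thread: the entity pulse_counter, its ports, and the procedure names init_regs, update_regs and update_ports are assumptions made for this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example design: count input pulses.
entity pulse_counter is
  port (
    clock : in  std_logic;
    reset : in  std_logic;
    pulse : in  std_logic;
    count : out std_logic_vector(7 downto 0)
  );
end entity pulse_counter;

architecture single_process of pulse_counter is
begin

  main : process (clock, reset)
    -- All state lives in process variables.
    variable count_v : unsigned(7 downto 0) := (others => '0');

    -- Parameterless procedures declared inside the process can read
    -- and write its variables and drive its output signals directly.
    procedure init_regs is
    begin
      count_v := (others => '0');
    end procedure init_regs;

    procedure update_regs is
    begin
      if pulse = '1' then
        count_v := count_v + 1;
      end if;
    end procedure update_regs;

    procedure update_ports is
    begin
      count <= std_logic_vector(count_v);
    end procedure update_ports;

  begin
    -- The template itself never changes from design to design;
    -- only the procedure bodies above do.
    if reset = '1' then
      init_regs;
      update_ports;
    elsif rising_edge(clock) then
      update_regs;
      update_ports;
    end if;
  end process main;

end architecture single_process;
```

The point of the style is that the process body stays a fixed skeleton, while each named procedure gives a sequential, C-like place to put one concern. It does not change the limitation discussed in this thread: the process as a whole is still composed with other processes only through signals.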
> However, that doesn't affect the
> key point I made in an earlier post:
> once you have encapsulated some functionality
> in an HDL process, your ability to do any form
> of sequential/procedural composition is gone.

Aye, there's the rub. All else is pesky wires.
I make processes as large as I can tolerate,
and avert my eyes from their rough exteriors.

> This, I believe, is a fundamental problem with
> HDLs that will not go away until something with
> the expressive power of CSP/occam surfaces in
> the HDL world.

... and some device vendors buy in.

> Of course, everyone who knows what they're doing
> has perfectly good ways to deal with this problem.

Yet everyone's code is uniquely difficult to read.

> It's not a show-stopper. But it presents a
> fundamental barrier to the development of
> hardware description beyond a crude block-level
> process-by-process approach.

As does the economy.

> Yours more in hope than in expectation,

Maybe things will line up better after the next big bang.

    -- Mike Treseler
Jonathan Bromley <jonathan.bromley@MYCOMPANY.com> writes:

<snip>
> (Historical note for old grumps like me: Thirty
> years ago, occam and CSP had this sorted out, with
> arbitrary sequential and parallel composition of
> processes at all levels of the hierarchy.
I bought a book on Transputers (Transputer system design or some such) in a 2nd hand bookshop on holiday a few years ago. It all seemed so sensible! Does that make me an old grump too I wonder?
> But people who said "never liked threads" killed it dead.
Ho hum :(
> One day the wheel will turn, but not until the C programmer hegemony
> is smashed.)

Is that more or less likely now that Mentor has bought the
occam/CSP-derived Handel-C from Agility?

http://eetimes.eu/uk/212902105

Will CatapultC gain some explicit parallelism?

Cheers,
Martin

-- 
martin.j.thompson@trw.com
TRW Conekt - Consultancy in Engineering, Knowledge and Technology
http://www.conekt.net/electronics.html
"Martin Thompson" <martin.j.thompson@trw.com> wrote in message 
news:uocxrkagc.fsf@trw.com...
> Jonathan Bromley <jonathan.bromley@MYCOMPANY.com> writes:
>
> <snip>
>> (Historical note for old grumps like me: Thirty
>> years ago, occam and CSP had this sorted out, with
>> arbitrary sequential and parallel composition of
>> processes at all levels of the hierarchy.
>
> I bought a book on Transputers (Transputer system design or some such)
> in a 2nd hand bookshop on holiday a few years ago. It all seemed so
> sensible! Does that make me an old grump too I wonder?
>
>> But people who said "never liked threads" killed it dead.
>
> Ho hum :(
>
>> One day the wheel will turn, but not until the C programmer hegemony
>> is smashed.)
>
> Is that more or less likely now that Mentor has bought the
> occam/CSP-derived Handel-C from Agility?
>
> http://eetimes.eu/uk/212902105
It is less likely since Mentor is not the only company that is trying to crack the panacea of hardware design using a sequential language. Cadence recently announced their C to Silicon compiler and only a few days ago Synfora announced a 250% revenue growth. All I can say is *great*, I am a strong believer that this is the way forward. The human brain is not that well suited to think concurrently and hence engineers tend to write many more correct lines of code in a sequential language than in an HDL language. Unfortunately as far as I know all these tools still only work on datapath and sorting out the control part is a very difficult nut to crack.
> > Will CatapultC gain some explicit parallelism?
It is highly unlikely CatapultC will support Handel-C, which is, to a
certain degree, for good reason: the whole point of these tools is to
tell you which resources can run in parallel rather than you telling
the tool.

Hans
www.ht-lab.com
HT-Lab <hans64@ht-lab.com> wrote:
 
> It is less likely since Mentor is not the only company that is trying to
> crack the panacea of hardware design using a sequential language. Cadence
> recently announced their C to Silicon compiler and only a few days ago
> Synfora announced a 250% revenue growth. All I can say is *great*, I am a
> strong believer that this is the way forward. The human brain is not that
> well suited to think concurrently and hence engineers tend to write many
> more correct lines of code in a sequential language than in an
> HDL language.
Unfortunately computers aren't very good at abstract thinking, such as is required to turn a sequential algorithm into a non-sequential algorithm.
> Unfortunately as far as I know all these tools still only work on datapath
> and sorting out the control part is a very difficult nut to crack.

For the easy cases I can imagine it, or it may just result in a huge
block of hardware where each sub-block is used only once. That
doesn't really help much.

-- glen