Reply by Luis Benites March 22, 2021
On Tuesday, March 9, 2021 at 10:56:20 AM UTC-8, David Brown wrote:
> On 09/03/2021 03:52, Luis Benites wrote:
> > On Monday, October 27, 2014 at 11:05:32 AM UTC-7, david....@gmail.com wrote:
> >> What is the correct way to handle a PCIE request to a slow device?
> >>
> >> I have a xilinx spartan 6 PCIe using Integrated Block for PCI Express.
> >>
> >> The BAR memory map is decoded and some addresses map to fast ram,
> >> or local registers and these work OK, but some addresses map to
> >> slow devices.. like I2C or internal processes that need a few
> >> cycles to process before they can produce valid data to be returned
> >> to the PCI bus.
> >>
> >> Is there a way to tell the PCI bus to wait, or retry..?
> >>
> >> thanks
> >
> > Just in case this is what you are trying to do: stalling your whole
> > system and all other PCIe accesses to wait for an i2c read should
> > never be the solution to anything. You send your completion whenever
> > it's ready. If it takes you longer than the spec to complete then you
> > need to initiate the read in some other way (earlier), check for
> > ready and only then issue the read you can complete on time.
>
> Please look at the date of the post you are replying to. Do you think
> someone will have been waiting over six years for an answer to a Usenet
> post? It's nice that you are trying to help, of course.
Ha ha. Let's start a flame war over trying to help. Don't you have a better use of your time? Anyone looking for CURRENT PCIe help with a Google search will come across this post and get something from it. Nothing that was said is outdated.
Reply by David Brown March 9, 2021
On 09/03/2021 03:52, Luis Benites wrote:
> On Monday, October 27, 2014 at 11:05:32 AM UTC-7, david....@gmail.com wrote:
>> What is the correct way to handle a PCIE request to a slow device?
>>
>> I have a xilinx spartan 6 PCIe using Integrated Block for PCI Express.
>>
>> The BAR memory map is decoded and some addresses map to fast ram,
>> or local registers and these work OK, but some addresses map to
>> slow devices.. like I2C or internal processes that need a few
>> cycles to process before they can produce valid data to be returned
>> to the PCI bus.
>>
>> Is there a way to tell the PCI bus to wait, or retry..?
>>
>> thanks
>
> Just in case this is what you are trying to do: stalling your whole
> system and all other PCIe accesses to wait for an i2c read should
> never be the solution to anything. You send your completion whenever
> it's ready. If it takes you longer than the spec to complete then you
> need to initiate the read in some other way (earlier), check for
> ready and only then issue the read you can complete on time.
Please look at the date of the post you are replying to. Do you think someone will have been waiting over six years for an answer to a Usenet post? It's nice that you are trying to help, of course.
Reply by Luis Benites March 8, 2021
On Monday, October 27, 2014 at 11:05:32 AM UTC-7, david....@gmail.com wrote:
> What is the correct way to handle a PCIE request to a slow device?
>
> I have a xilinx spartan 6 PCIe using Integrated Block for PCI Express.
>
> The BAR memory map is decoded and some addresses map to fast ram, or local registers and these work OK,
> but some addresses map to slow devices.. like I2C or internal processes that need a few cycles to process before they can produce valid data to be returned to the PCI bus.
>
> Is there a way to tell the PCI bus to wait, or retry..?
>
> thanks
Just in case this is what you are trying to do: stalling your whole system and all other PCIe accesses to wait for an i2c read should never be the solution to anything. You send your completion whenever it's ready. If it takes you longer than the spec allows to complete, then you need to initiate the read in some other way (earlier), check for ready, and only then issue the read you can complete on time.
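The "initiate earlier, check for ready, then read" sequence described above can be sketched as a small behavioural model. This is only an illustration under stated assumptions: the CMD, STATUS and DATA registers and the `SlowDeviceBridge` class are hypothetical names, not part of the Xilinx core or of any post in this thread, and `tick()` merely stands in for time passing on the slow side.

```python
class SlowDeviceBridge:
    """Toy model of an endpoint fronting a slow (I2C-like) peripheral.

    CMD, STATUS and DATA are hypothetical registers. Each access to them
    completes at once because they are modelled as local registers.
    """

    def __init__(self, peripheral, latency_ticks):
        self._peripheral = peripheral    # dict: slow-device address -> value
        self._latency = latency_ticks    # ticks one slow read takes
        self._remaining = 0
        self._pending_addr = None
        self._data = 0
        self._ready = False

    def write_cmd(self, addr):
        """Posted write: start the slow read and return immediately."""
        self._pending_addr = addr
        self._remaining = self._latency
        self._ready = False

    def read_status(self):
        """Fast read: 1 once DATA holds the requested value, else 0."""
        return 1 if self._ready else 0

    def read_data(self):
        """Fast read: value captured by the last completed slow read."""
        return self._data

    def tick(self):
        """Advance the slow side by one time step."""
        if self._pending_addr is None:
            return
        self._remaining -= 1
        if self._remaining <= 0:
            self._data = self._peripheral[self._pending_addr]
            self._ready = True
            self._pending_addr = None


def host_read(bridge, addr, max_polls=1000):
    """Host-side sequence: kick off the read, poll for ready, then read."""
    bridge.write_cmd(addr)
    for polls in range(1, max_polls + 1):
        bridge.tick()                    # stands in for elapsed time
        if bridge.read_status():
            return bridge.read_data(), polls
    raise TimeoutError("slow device did not become ready")


bridge = SlowDeviceBridge({0x10: 98}, latency_ticks=5)
value, polls = host_read(bridge, 0x10)
print(value, polls)
```

The point of the split is that no single PCIe read ever waits on the slow device: the host pays for the latency in polling, not in an outstanding non-posted request.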
Reply by David Binette November 4, 2014
On Monday, October 27, 2014 1:05:32 PM UTC-5, David Binette wrote:
> What is the correct way to handle a PCIE request to a slow device?
>
> I have a xilinx spartan 6 PCIe using Integrated Block for PCI Express.
>
> The BAR memory map is decoded and some addresses map to fast ram, or local registers and these work OK,
> but some addresses map to slow devices.. like I2C or internal processes that need a few cycles to process before they can produce valid data to be returned to the PCI bus.
>
> Is there a way to tell the PCI bus to wait, or retry..?
>
> thanks
I have found some documents that specifically refer to 'throttling data on the transmit path':

http://www.xilinx.com/support/answers/21707.html

"You can pause the transfer of packets between the user application and the PCI Express Core by deasserting trn_tsrc_rdy_n. There is no limit to the number of cycles that trn_tsrc_rdy_n can be deasserted. The PCI Express Core holds the packet in its transmit buffer until you finish moving the packet into the core signified by the assertion of trn_teof_n. Once the complete packet is stored inside the core, it is transmitted on the PCI Express Link. You cannot directly affect the packet's transmission on the link through trn_tsrc_rdy_n. However, if you deassert trn_tsrc_rdy_n excessively it slows the overall bandwidth because the core does not have the packet to send until you assert trn_teof_n.

NOTE: Currently, you must pause back-to-back TLPs by at least one cycle by deasserting trn_tsrc_rdy_n. Please see (Xilinx Answer 21708) for more information."

and

http://www.xilinx.com/support/answers/21592.html

"The user application input trn_tsrc_rdy_n should not be asserted low all the time. It should only be asserted when the user application is involved in a data transfer. It should be asserted at the same time as trn_tsof_n and deasserted with trn_teof_n. It is permissible to insert wait states between the assertions of trn_sof_n and trn_teof_n by deasserting trn_tsrc_rdy_n."
I have some screen snapshots of the PCIe signals involved.

1) The normal PCIe signals (this works): '99' is read on the PCI bus on the Linux system, at about 600,000 reads/sec.
http://www.mediafire.com/view/3v81znw933hwtq4/throttle1.png

2) The throttled version: 0xffffffff is read on the PCIe bus and the rate is about 23 reads/sec. It should read the value 98.
http://www.mediafire.com/view/g8gc6r864c3ze3p/throttle2.png

I added some extra lines:

'rd_wait_i' tells the IP core to wait until this line is de-asserted. In operation it is intended to throttle the transmit of the single 32-bit data value to be transmitted during trn_tsrc_rdy_n=0 and trn_teof_n=0.

'req_data' tells my app that a read cycle is occurring.

If anyone sees something awry with 'throttle2.png' I'd sure like to know.
Reply by kkoorndyk October 31, 2014
On Friday, October 31, 2014 6:58:53 AM UTC-4, Petter Gustad wrote:
> David Binette <david.binette@gmail.com> writes:
>
> > I haven't put it on the simulator, just doing compiles and tests but the turn time is long.
>
> Does Xilinx provide a realistic Root Complex model or some other type of
> PCIe verification environment?
>
> Rolling your own can be some amount of work. However, it might be
> possible to instantiate a Xilinx Root Complex in your testbench and use
> that to stimulate your DUT.
>
> //Petter
>
> --
> .sig removed by request.
Yes, the example design provided with the PCIe EP Block contains a root port model.

I've recently worked on a Spartan 6 design similar to the OP's, in which the FPGA is a bridge between the processor over PCIe and a local bus with several peripherals. I started with the example design and modified the PIO Rx and Tx engines to work for my application. Most of the local bus cycles are fast enough that software does not have to wait. A timeout was implemented on the local bus cycles that issues an MSI interrupt on the PCIe link if the peripheral doesn't respond within the timeout period (~1 us). One issue we ran into WRT PCIe packet timing is that the MSI interrupt was not being seen by software before the next transaction was issued on the link. We ended up using a status register for software to poll instead.
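A rough sketch of the "timeout plus polled status register" arrangement described above, as a behavioural model rather than the poster's actual design. The names `LocalBusBridge` and `STATUS_TIMEOUT`, the write-one-to-clear behaviour and the 0xFFFFFFFF placeholder returned on timeout are all assumptions made for illustration. Only the ~1 us timeout figure comes from the post.

```python
TIMEOUT_NS = 1000        # ~1 us, the figure quoted in the post
STATUS_TIMEOUT = 0x1     # hypothetical bit in a hypothetical status register
PLACEHOLDER = 0xFFFFFFFF # assumed data returned when the peripheral is late


class LocalBusBridge:
    """Toy model: PCIe reads are forwarded to a local bus with a timeout."""

    def __init__(self, peripherals):
        # peripherals: address -> (response_time_ns, value)
        self._peripherals = peripherals
        self.status = 0

    def read(self, addr):
        """Complete the read either with data or, on timeout, a placeholder."""
        response_ns, value = self._peripherals[addr]
        if response_ns > TIMEOUT_NS:
            self.status |= STATUS_TIMEOUT   # latch the error for software
            return PLACEHOLDER
        return value

    def clear_status(self, mask):
        """Write-one-to-clear, so that reading status has no side effect."""
        self.status &= ~mask


def checked_read(bridge, addr):
    """Software side: read, then check the status register for a timeout."""
    value = bridge.read(addr)
    if bridge.status & STATUS_TIMEOUT:
        bridge.clear_status(STATUS_TIMEOUT)
        raise TimeoutError(f"peripheral at {addr:#x} did not respond")
    return value


bridge = LocalBusBridge({0x00: (200, 99), 0x04: (5000, 98)})
fast = checked_read(bridge, 0x00)
try:
    checked_read(bridge, 0x04)
    timed_out = False
except TimeoutError:
    timed_out = True
print(fast, timed_out, bridge.status)
```

Because the status check is an ordinary read issued after the data read, software does not depend on an interrupt arriving before its next transaction.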
Reply by October 31, 2014
David Binette <david.binette@gmail.com> writes:

> I haven't put it on the simulator, just doing compiles and tests but the turn time is long.
Does Xilinx provide a realistic Root Complex model or some other type of PCIe verification environment?

Rolling your own can be some amount of work. However, it might be possible to instantiate a Xilinx Root Complex in your testbench and use that to stimulate your DUT.

//Petter

--
.sig removed by request.
Reply by David Binette October 30, 2014
On Monday, October 27, 2014 1:05:32 PM UTC-5, David Binette wrote:
> What is the correct way to handle a PCIE request to a slow device?
>
> I have a xilinx spartan 6 PCIe using Integrated Block for PCI Express.
>
> The BAR memory map is decoded and some addresses map to fast ram, or local registers and these work OK,
> but some addresses map to slow devices.. like I2C or internal processes that need a few cycles to process before they can produce valid data to be returned to the PCI bus.
>
> Is there a way to tell the PCI bus to wait, or retry..?
>
> thanks
Thanks Mark for your time and comments, which were helpful. I haven't put it on the simulator, just doing compiles and tests but the turn time is long.
Reply by Mark Curry October 29, 2014
In article <b22fff2a-6bf2-4285-8632-7cda5fa59541@googlegroups.com>,
David Binette  <david.binette@gmail.com> wrote:
>On Monday, October 27, 2014 1:05:32 PM UTC-5, David Binette wrote:
>> What is the correct way to handle a PCIE request to a slow device?
>>
>> I have a xilinx spartan 6 PCIe using Integrated Block for PCI Express.
>>
>> The BAR memory map is decoded and some addresses map to fast ram, or local registers and these work OK,
>> but some addresses map to slow devices.. like I2C or internal processes that need a few cycles to process before they can produce valid data to be returned to the PCI bus.
>>
>> Is there a way to tell the PCI bus to wait, or retry..?
>>
>> thanks
>
>How do other ppl handle things like doing SMBus reads over PCIe or
>an I2C device.. the first read is certainly going to need some time
>to complete before it can return data.
>
>Perhaps I just fumbled something during my tests and subsequently discarded
>what should have been a viable approach.
David,

I can't offer any specific advice - but generally all PCIE transactions are "stalled", whether they're reading from a slow device on another clock or a "fast" device on the same clock. For a PIO read you get:

1. The host issues a PIO read.
2. A TLP MRd packet is formed and sent across the serial interface.
3. The xilinx endpoint decodes the packet, determines that the packet is meant for the user logic - you. It sends the information out to the user interface logic.
4. Your logic issues the read, and responds.
5. The CplD packet is formatted and transmitted back across the PCIE link.
...

All of that takes quite a bit of time. The fact that step 4 takes a few cycles (give or take 10s or perhaps even 100s) is almost irrelevant. The PCIE timeout mechanism doesn't come into play until this number is very high (I've not used it, but I'd think we're talking 10s of ms).

The whole process has quite a bit of latency. A few cycles here or there aren't going to matter.

I don't use that specific PCIE core, nor Xilinx logic (I'm using the Virtex7 core, with AXIS interfaces tied to my logic). But the general flow should be the same. I'd review the interface specification to fully understand what's required.

Are you running sims with the Xilinx logic?

Regards,

Mark
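To put rough numbers on step 4 for the I2C case raised earlier in the thread, here is a back-of-the-envelope estimate. It is a sketch under assumptions, not a measurement: it counts only the clocked bits of a typical "write register pointer, repeated start, read one byte" I2C transaction, ignores START/STOP overhead and clock stretching, and takes the PCIe completion timeout window as roughly 50 us to 50 ms, a figure that should be checked against the PCIe Base Specification and the actual root complex settings.

```python
I2C_BITS_PER_BYTE = 9            # 8 data bits plus the ACK/NACK bit

# Assumed completion timeout window; verify against the spec and platform.
ASSUMED_TIMEOUT_MIN_US = 50
ASSUMED_TIMEOUT_MAX_US = 50_000


def i2c_register_read_us(bus_hz, data_bytes=1):
    """Estimate the duration of one I2C register read in microseconds.

    Bytes on the wire: address+W, register pointer, address+R, then data.
    """
    bytes_on_wire = 3 + data_bytes
    bits = bytes_on_wire * I2C_BITS_PER_BYTE
    return bits * 1_000_000 / bus_hz


for bus_hz in (100_000, 400_000):
    t_us = i2c_register_read_us(bus_hz)
    print(
        f"{bus_hz // 1000} kHz: ~{t_us:.0f} us,",
        "over the shortest assumed timeout:", t_us > ASSUMED_TIMEOUT_MIN_US,
        "| under the longest:", t_us < ASSUMED_TIMEOUT_MAX_US,
    )
```

Under these assumptions a single-byte I2C register read takes on the order of 100 us or more, which is far beyond "a few cycles" and is the case where holding the read open becomes questionable, even though it is still well under tens of milliseconds.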
Reply by Chris Higgs October 29, 2014
On Wednesday, October 29, 2014 2:33:54 PM UTC, David Binette wrote:

> That would be OK for most cases but some reads have side effects,
> such as clearing another register upon read. This could be overcome
> and is not a show stopper, that part could be redesigned.
It's generally best to avoid side-effects if at all possible and make all reads idempotent. Life is much easier for software that way.

For example, TLPs may be re-ordered, accesses above a certain size may not occur in the order you expect, the root complex may attempt to pre-fetch a value, in future you may be using this device over a lossy medium like Ethernet.

All of these things can be controlled (or worked around) in software but often lead to inefficiencies. If you have the choice, it's always better to design your interface with a view to simplifying the software interaction. This generally also yields simpler hardware and fewer gotchas in the documentation so everyone's a winner!

Thanks,

Chris
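A small model of why idempotent reads help, comparing a read-to-clear status register with a write-one-to-clear one when an extra read sneaks in (for example a prefetch whose result is discarded). Both classes and the `software_sees_event` helper are hypothetical illustrations, not code from any design discussed in this thread.

```python
class ReadToClearStatus:
    """Status register whose read has a side effect: it clears the bits."""

    def __init__(self):
        self._bits = 0

    def raise_event(self, mask):
        self._bits |= mask

    def read(self):
        value = self._bits
        self._bits = 0           # side effect: reading destroys the state
        return value


class WriteOneToClearStatus:
    """Status register with idempotent reads; software clears bits itself."""

    def __init__(self):
        self._bits = 0

    def raise_event(self, mask):
        self._bits |= mask

    def read(self):
        return self._bits        # no side effect, safe to repeat

    def write(self, mask):
        self._bits &= ~mask      # writing 1 to a bit clears it


def software_sees_event(reg, extra_read):
    """Return True if software still observes the event bit."""
    reg.raise_event(0x1)
    if extra_read:
        reg.read()               # e.g. a prefetch; the result is thrown away
    return bool(reg.read() & 0x1)


print("read-to-clear, extra read:", software_sees_event(ReadToClearStatus(), True))
print("write-1-to-clear, extra read:", software_sees_event(WriteOneToClearStatus(), True))
```

With read-to-clear, the discarded read consumes the event and software never sees it. With write-one-to-clear, repeated reads return the same value until software explicitly acknowledges the bit.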
Reply by David Binette October 29, 2014
On Wednesday, October 29, 2014 9:33:54 AM UTC-5, David Binette wrote:
> On Monday, October 27, 2014 1:05:32 PM UTC-5, David Binette wrote:
> > What is the correct way to handle a PCIE request to a slow device?
> >
> > I have a xilinx spartan 6 PCIe using Integrated Block for PCI Express.
> >
> > The BAR memory map is decoded and some addresses map to fast ram, or local registers and these work OK,
> > but some addresses map to slow devices.. like I2C or internal processes that need a few cycles to process before they can produce valid data to be returned to the PCI bus.
> >
> > Is there a way to tell the PCI bus to wait, or retry..?
> >
> > thanks
>
> Hi Sean,
> Thanks for the suggestions, but I think what I really need is a way
> to stall the current TLP to allow the read/access to complete.
>
> -- Is it streaming data? Do you need to catch all the
> -- data or do you want to read out only one single value occasionally? Is
>
> The data is always changing, and only needs to be read occasionally.
>
> -- You could just always transfer the data you have to the PCIe clock
> -- domain whenever it changes. Each time there is a new value, always
> -- transfer it to the PCIe clock domain immediately and put it e.g. into a
> -- BAR register. So when you issue a PCIe read request, there's data
> -- already there that you can put into your reply message immediately.
> -- Worst case is you don't get the very latest value but the one before that.
>
> That would be OK for most cases but some reads have side effects,
> such as clearing another register upon read. This could be overcome
> and is not a show stopper, that part could be redesigned.
>
> also since the external device has a lot of registers and they are
> typically accessed by setting their address and reading the result
> (sometimes a calculated result) it would require significant changes
> to create a bank of shadow values to capture them all for
> instantaneous retrieval instead of indexed on-demand access
>
> How do other ppl handle things like doing SMBus reads over PCIe or
> an I2C device.. the first read is certainly going to need some time
> to complete before it can return data.
>
> Perhaps I just fumbled something during my tests and subsequently discarded
> what should have been a viable approach.
>
> If I knew exactly how it should be done I could focus my efforts on that.
P.S. I know that SMBus is an independent bus on the PCIe connector; I don't mean to complicate the topic with that. It was only an example for illustration.