
What should a custom class bulk transfer rate be on the 24FJ256GB106?

Author
dlc@frii.com
Super Member
  • Total Posts : 370
  • Reward points : 0
  • Joined: 2006/03/03 10:49:45
  • Location: 0
  • Status: offline
2011/03/22 12:31:53 (permalink)
0

What should a custom class bulk transfer rate be on the 24FJ256GB106?

I've been searching for a couple hours and find no joy in this question. 

I have a project (not mine) that I'm integrating with that uses the PIC24FJ256GB106 as a USB device.  My data transfer rates are pitiful, on the order of about 2KB/s.  The code uses a bulk transfer endpoint; on the PC side it uses the Microsoft "Simple_USB" project, which appears to be based on Jan Axelson's code.  On the PIC, the developer used the Microchip USB custom class source.

The device does enumerate and work on the USB bus, but its throughput is horrific.  My timing shows that it takes 250ms+ to move 500 bytes from the device to the PC.  In reality, this USB device gets the data from another device over a 200Kbps RS485 link, then sends it to the PC.

My study of the problem tells me that if the bulk transfer can have a maximum payload of 64 bytes, and the USB timeslots are 1ms, then if the planets all align I should get 64KBps.  I get about 2KBps.  I know that getting the max is pretty unlikely, but still, 2KBps? Clearly something is wrong here.  I don't have a USB bus analyzer so I'm hoping others have seen issues and solved them and will share their wisdom.

How can I get better than 64KBps bulk transfer rates?  Multiple endpoints?  It seems there is a large delta between what I am understanding and what the USB spec says it can do...

Many thanks,
DLC
#1
chinzei
Super Member
  • Total Posts : 2250
  • Reward points : 0
  • Joined: 2003/11/07 12:39:02
  • Location: Tokyo, Japan
  • Status: offline
Re:What should a custom class bulk transfer rate be on the 24FJ256GB106? 2011/03/22 13:32:24 (permalink) ☄ Helpful
+3 (2)

The principle to get better bulk transfer speed is simple.

On the device side,
- Keep as many transactions as possible full-size (64 bytes), because a short packet terminates the transfer.

On the PC application,
- Request as large a transfer size as possible from the device driver.


If you aren't sure exactly what the USB terms "transfer" and "transaction" mean, read this post.

Transfer - transaction - packet
http://www.cygnal.org/ubb/Forum9/HTML/001627.html



I've explained this theme in many places.
Rather than write a new explanation for this forum, I'll paste a straight copy of my post on another forum :)

The original post is here.
https://my.st.com/public/...0Library%20Performance

> With your test as it stands, you are testing the latency of Windows USB transfers, not the performance of the library.

This is testing neither "the performance of the library" nor "the latency of Windows USB transfers".
WinUSB exceeds 20M bytes/s with high-speed devices, which means the PC device driver is not the bottleneck here.
You are examining bus scheduling on the host controller (HC) hardware. The result in duncan's post above is a typical case showing how bus scheduling works on a full-speed HC. (Because of the loopback, the speed shown is half of the typical case.)

// Tested on USB 1 and USB 3
// write size (bytes) = data rate
// 16384 = 547 KB/s
//  8192 = 512 KB/s
//  4096 = 456 KB/s
//  2048 = 409 KB/s
//  1024 = 340 KB/s - < 500 writes per second
//   512 = 256 KB/s - 500 writes per second
//   256 = 127 KB/s
//   128 = 64 KB/s
//    64 = 32 KB/s
//    32 = 16 KB/s
//    16 = 8 KB/s
//     8 = 4 KB/s
//     4 = 2 KB/s
//     2 = 1 KB/s
//     1 = 0.5 KB/s

A full-speed HC (UHCI, OHCI) schedules the next transfer at the end of each USB frame. Therefore, for smaller transfer sizes, the transfer speed increases in proportion to the transfer size.
e.g. a 1-byte transfer results in 1 byte/frame (1 ms). A 512-byte transfer gives 512 bytes/frame: one transfer per frame.
Theoretically, a full-speed frame saturates at 19 full-size transactions (19 x 64 bytes). Because of this saturation, a greater transfer size doesn't increase the transfer speed much further.
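
The scheduling rule above can be put into numbers with a rough model. The sketch below is only an illustration under two simplifying assumptions taken from the text: one transfer is started per 1 ms frame, and a frame carries at most 19 full-size transactions. The function names are made up for this example, and a real bus shared with other devices will come in lower. Note also that the measured table above shows about half of these figures because of the loopback.

```c
#include <stdint.h>

/* Simplifying assumptions for this sketch (not measured values):
 *  - the host starts one bulk transfer per 1 ms frame,
 *  - a full-speed frame carries at most 19 full-size (64-byte) transactions. */
#define MAX_PACKET_SIZE        64u
#define MAX_PACKETS_PER_FRAME  19u
#define FRAME_CAPACITY (MAX_PACKET_SIZE * MAX_PACKETS_PER_FRAME) /* 1216 bytes */

/* Number of 1 ms frames that one transfer of `size` bytes occupies. */
uint32_t frames_for_transfer(uint32_t size)
{
    if (size == 0u) {
        return 1u; /* even an empty transfer costs one frame in this model */
    }
    return (size + FRAME_CAPACITY - 1u) / FRAME_CAPACITY; /* round up */
}

/* Estimated throughput in bytes per second when transfers of `size`
 * bytes are issued back to back, one after another. */
uint32_t estimated_bytes_per_second(uint32_t size)
{
    return (size * 1000u) / frames_for_transfer(size);
}
```

In this model a 64-byte transfer gives 64000 bytes/s and a 512-byte transfer gives 512000 bytes/s, while a 16384-byte transfer spans 14 frames and levels off near 1.17M bytes/s.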

[Tips 1]
When the full-speed device connects to a PC through a USB2.0 hub, you'll see higher speed at the smaller transfer sizes. A USB2.0 hub converts full-speed transactions into high-speed ones, and a high-speed HC (EHCI) schedules per micro-frame (125us) instead of per frame (1ms).

[Tips 2]
The WinUSB RAW_IO policy breaks this transfer-per-frame restriction for bulk IN transfers. To make this feature work, issue multiple OVERLAPPED WinUsb_ReadPipe() calls in advance, without waiting for any single call to finish.


Once we understand this HC scheduling, the principles for getting better transfer speed are simple.

On the device side,
- Always exchange full-size (64 bytes) transactions
- Double buffering increases the performance

A short transaction (less than 64 bytes) terminates the transfer. Once a transfer terminates, we have to wait for the next frame to start the next transfer. Therefore, keep the transactions full-size to maximize the transfer speed.
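
As a small illustration of the point about short transactions, the sketch below splits a payload into 64-byte transactions and reports whether a short one is left at the end. The type and function names are hypothetical, chosen for this example only; they are not part of the Microchip stack.

```c
#include <stdint.h>

#define MAX_PACKET_SIZE 64u /* full-speed bulk endpoint, as in the text */

/* How one payload maps onto bulk transactions (hypothetical helper type). */
typedef struct {
    uint32_t full_packets; /* number of full-size 64-byte transactions   */
    uint32_t short_bytes;  /* size of the trailing short packet, 0 if none */
} packet_plan_t;

/* Split `payload_len` bytes into full-size transactions plus, if the
 * length is not a multiple of 64, one trailing short packet. That short
 * packet is what terminates the transfer on the bus. */
packet_plan_t plan_packets(uint32_t payload_len)
{
    packet_plan_t plan;
    plan.full_packets = payload_len / MAX_PACKET_SIZE;
    plan.short_bytes  = payload_len % MAX_PACKET_SIZE;
    return plan;
}
```

For the 500-byte block mentioned in the question, this gives 7 full transactions followed by one 52-byte short packet, which ends the transfer.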

On the PC application,
- Request as large a transfer size as possible from the device driver.

When the entire data transfer is split into shorter ones, the gaps between transfers reduce the speed. WinUSB accepts a transfer of some 10M bytes in a single WinUsb_ReadPipe/WritePipe call.


In real applications, however, we may need to modify these simple principles to satisfy other requirements.
Here are three typical examples.

a) ROM writer
On the device side, the ROM can be read or written quickly at any time, whenever required. We don't need any buffer larger than a full-size packet (64 bytes): two such buffers, for double-buffering, are enough.
On the PC application, when the target ROM size is 1M bytes or so, a single read/write call does the job. For larger ROMs, we may split the call into shorter chunks, just to show a progress bar.

b) Data streaming of ADC / DAC
An ADC generates data at a regular sampling interval. A DAC has to be fed regularly, without any gap. But bulk transfer speed may fluctuate, affected by the activity of other devices on the bus. On the device side, buffers of sufficient size are provided for the IN (ADC) and OUT (DAC) endpoints to absorb the speed fluctuations. To keep the transactions full-size, the buffer size is tuned to a multiple of 64 bytes (64 x N bytes).
On the PC side, suppose that a PC data logger displays the ADC result in real time for the user. In this application, the PC app requests shorter chunks of data so that the display refreshes continuously. The chunk size is tuned to fit the display refresh rate.
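
One way to arrive at a "64 x N" buffer size for case b) is sketched below. The idea of sizing the buffer from a data rate and a worst-case stall time is an assumption made for this illustration, as are the function names; the right stall figure depends on the bus and has to be measured.

```c
#include <stdint.h>

#define MAX_PACKET_SIZE 64u

/* Round n up to the next multiple of 64, so the buffer always drains
 * in full-size transactions. */
uint32_t round_up_to_packet_multiple(uint32_t n)
{
    return ((n + MAX_PACKET_SIZE - 1u) / MAX_PACKET_SIZE) * MAX_PACKET_SIZE;
}

/* Buffer size needed to ride out a bus stall of `worst_stall_ms`
 * (a hypothetical, application-specific figure) at a steady stream
 * rate of `bytes_per_second`, rounded up to a multiple of 64 bytes. */
uint32_t stream_buffer_size(uint32_t bytes_per_second, uint32_t worst_stall_ms)
{
    /* 64-bit intermediate avoids overflow; +999 rounds the division up. */
    uint64_t needed = ((uint64_t)bytes_per_second * worst_stall_ms + 999u) / 1000u;
    if (needed == 0u) {
        needed = 1u; /* always reserve at least one packet */
    }
    return round_up_to_packet_multiple((uint32_t)needed);
}
```

For example, a 96000 bytes/s stream that must survive a 5 ms stall needs 480 bytes, which rounds up to 512 (64 x 8).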

c) USB-serial converters (USB-UART, USB-I2C, USB-SPI, USB-CAN, etc)
The traffic on these serial links sometimes comes in bursts and sometimes sporadically. We can't assume a regular interval.
To prevent data loss during burst traffic, buffers of sufficient size are assigned to IN and OUT, as in case b) above.  UART RX data does not always arrive in 64-byte chunks. It may arrive at sporadic intervals and in short chunks. If the device always waited for 64 bytes in the buffer, no response could pass from the device to the PC for a long time. To guarantee a response deadline, a latency timer forces the packet transfer even when the buffer holds less than 64 bytes.

The FTDI appnote explains this configuration well.
http://www.ftdichip.com/S...32b_04smalldataend.htm
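
The latency timer described in case c) could look roughly like the sketch below. It is a minimal illustration, not FTDI's or Microchip's implementation: the structure, the function names and the 16 ms timeout are all assumptions made for this example, and the actual USB send call is left out.

```c
#include <stdint.h>

#define MAX_PACKET_SIZE    64u
#define LATENCY_TIMEOUT_MS 16u /* hypothetical value, tune per application */

/* State of one IN pipe fed by a serial receiver (hypothetical type). */
typedef struct {
    uint16_t buffered;  /* bytes waiting to be sent to the host        */
    uint16_t waited_ms; /* ms that a partial packet has been waiting    */
} in_pipe_state_t;

/* Call when `n` bytes arrive from the serial side. */
void on_bytes_received(in_pipe_state_t *s, uint16_t n)
{
    s->buffered += n;
}

/* Call once per millisecond. Returns the number of bytes to send now,
 * or 0 to keep waiting. A full 64-byte packet goes out immediately;
 * a partial packet goes out only after the latency timer expires. */
uint16_t on_tick_1ms(in_pipe_state_t *s)
{
    if (s->buffered >= MAX_PACKET_SIZE) {
        s->buffered -= MAX_PACKET_SIZE;
        s->waited_ms = 0;
        return MAX_PACKET_SIZE;
    }
    if (s->buffered > 0u) {
        s->waited_ms++;
        if (s->waited_ms >= LATENCY_TIMEOUT_MS) {
            uint16_t n = s->buffered; /* flush the short packet */
            s->buffered = 0;
            s->waited_ms = 0;
            return n;
        }
    }
    return 0;
}
```

With these values, 10 stray bytes are held for 16 ticks and then sent as one short packet, while a 100-byte burst sends its first 64 bytes on the very next tick.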

Tsuneo
#2
dlc@frii.com
Super Member
  • Total Posts : 370
  • Reward points : 0
  • Joined: 2006/03/03 10:49:45
  • Location: 0
  • Status: offline
Re:What should a custom class bulk transfer rate be on the 24FJ256GB106? 2011/03/23 08:03:40 (permalink)
0
chinzei

The principle to get better bulk transfer speed is simple.

On the device side,
- Keep as many transactions as possible full-size (64 bytes), because a short packet terminates the transfer.

On the PC application,
- Request as large a transfer size as possible from the device driver.

[snip]

Tsuneo


Tsuneo,

  Thank you VERY much for your time and help.  My teammates and I working on this project are new to USB and, as with all new concepts, are struggling to understand it while feverishly working to create a viable product at the same time.  Would you recommend Jan Axelson's "USB Complete" as the tome we can use to speed our introduction to USB?

Many Thanks,
DLC

#3
chinzei
Super Member
  • Total Posts : 2250
  • Reward points : 0
  • Joined: 2003/11/07 12:39:02
  • Location: Tokyo, Japan
  • Status: offline
Re:What should a custom class bulk transfer rate be on the 24FJ256GB106? 2011/03/24 15:02:35 (permalink)
0

Would you recommend Jan Axelson's "USB Complete" as the tome we can use to speed our introduction to USB?

Jan's "USB Complete" is a must for USB development.
This book is an encyclopedia of USB: a book for finding answers to your USB questions, not a book to read from cover to cover.

When you are working with a USB stack like the Microchip Application Library, most of the USB details are hidden under the stack. But there are still USB conventions you have to know (like the transfer-transaction relation above). When you come across questions in USB development, ask them on this forum. The discussion gives you clues for understanding the background of your problem, and then you'll get a deeper understanding of Jan's book and the USB spec. Once you fully grasp the background, the best solution to your problem tends to suggest itself.

USB specs on USB.org
http://www.usb.org/developers/docs

Approved Class Specification Documents on USB.org
http://www.usb.org/developers/devclass_docs

Tsuneo
post edited by chinzei - 2011/03/24 15:18:09
#4