Altera Forum







How Can I set up DMA operation with my own PC software application?

#1
July 22nd, 2008, 08:49 AM
huzj_ecc

Hi All:

I want to reuse the pcie_highperformancedesign example provided with the Arria GX Development Kit, but I am confused by the PC software application altpcie_demo.exe.
I am trying to make the FPGA initiate DMA read and write operations from my own PC software application, just as altpcie_demo does, but it fails.

First, I used Jungo WinDriver to generate a PCIe driver. With the API functions provided by the driver I can read and write the configuration registers, memory BAR 1:0 (the syncram), and BAR2 (the DMA control registers).

Second, I create a read descriptor table (a header plus 2 descriptors) and set the data (length, EP memory address, RC memory address) for the descriptors. The header has four DWORDs (DW0, DW1, DW2, DW3). For DMA read, I set DW0=0x00040002, DW1=0, DW2=address of the header, DW3=0x1. Then I write DW0 to BAR2+0x10, DW1 to BAR2+0x14, DW2 to BAR2+0x18, and DW3 to BAR2+0x1c.

My first question: at which memory address can I poll the RCLast value to detect the completion of the DMA read?

Third, I want to transfer the DMA read data back to the PC. I create a write descriptor table (a header plus 2 descriptors). In each descriptor, I set the PC memory address for the write-back data and the EP memory address correctly. The header has four DWORDs (DW0-DW3). For DMA write, I set DW0=0x00050002, DW1=0, DW2=address of the header, DW3=0x1. Then I write DW0 to BAR2+0x0, DW1 to BAR2+0x4, DW2 to BAR2+0x8, and DW3 to BAR2+0xc.

In the end, I checked the write-back data and found that it is all zeros. It seems that the FPGA does nothing at all.

What are the detailed steps I should follow to set up the DMA operation correctly? I read the PCI Express Compiler documentation but did not find enough information about the software application.

Thanks a lot for any help.

Last edited by huzj_ecc : July 22nd, 2008 at 08:52 AM.
#2
July 22nd, 2008, 11:55 AM
heppermann

What I found confusing in the 'PCI Express Compiler User Guide' on pg 6-19 was how the 5-step process to kick off the DMA relates to the Chaining DMA Descriptor Table. The 5-step process sounds like the implementation of the Simple DMA. If it is not, what do the terms PCI Express address (step 1) and master memory block (step 2) refer to?

Does master memory block refer to the Chaining DMA Descriptor Table's offset in BAR2?

A few other questions I had about the example are:
  • Are the Descriptor Tables supposed to be written into the shared memory assigned to BAR2?
    • Figure 6-3 shows the descriptor tables in RC memory. If this is the case, how does the Arria access these data structures? Is the RC memory in Figure 6-3 an implicit shared memory block?
  • pg 6-19 says 'The software application writes the descriptor header into the endpoint header descriptor register'. Table 6-7 maps the descriptor headers to endpoint addresses 0x00 through 0x20. These memory spaces conflict with the 5-step process on pg 6-19 to kick off the DMA. It looks like I am confusing something here. Does anyone know?
Any help would be appreciated. Thanks.
#3
July 23rd, 2008, 10:10 AM
huzj_ecc

I think the correct method is for the software application to write the Descriptor Table Header into the BAR2 (or 3) mapped endpoint header descriptor registers at offsets 0x00-0x1c.

The PCIe Compiler 7.2 User Guide, page 6-17, says: "altpcie_dma_prg_reg - This module contains the descriptor header table registers which get programmed by the software application. This module collects PCI Express transaction layer packets from the software application with the TLP type MWr on BAR2 or 3" and "Header register module - RC programs the descriptor header (4 DWORDS) at the beginning of the DMA".

The next paragraph says: "altpcie_dma_descriptor - This module retrieves the DMA read or write descriptor from the root port memory, and stores it in descriptor FIFO. This module issues PCI Express transaction layer packets to the BFM shared memory with the TLP type MRd".

In the simulation model, I think the Root Port BFM sources the data (descriptors) for completions in response to read transactions received from the PCIe link. But in a real software application, which module responds to the MRd TLPs issued by altpcie_dma_descriptor? Does the Jungo PCIe driver respond automatically, or should I write code to handle such MRd TLPs in my software application?

On page 6-18, Table 6-4 describes the BAR/address map. Should I use BAR4 (or 5) if I want to use the rc_slave module in the example to bypass the chaining DMA? But BAR0 (or 1) is also described as being used for the rc_slave module. Is that a mistake?

I have so many questions about the chaining DMA example. I wonder why Altera does not release the source code of the PCIe software application, such as altpcie_demo.exe.

Thanks for reply.

Last edited by huzj_ecc : July 23rd, 2008 at 10:15 AM.
#4
July 23rd, 2008, 11:29 AM
heppermann

  • You said: I have so many questions about the chaining DMA example. I wonder why Altera does not release the source code of the PCIe software application, such as altpcie_demo.exe.
You are correct; all of this confusion would be eliminated if they released this source code (the driver source might be needed as well). Do you think we need to start a new thread to explicitly ask for this?
  • You said: I think the correct method is for the software application to write the Descriptor Table Header into the BAR2 (or 3) mapped endpoint header descriptor registers at offsets 0x00-0x1c.
After reading the PCI Express Compiler User Guide, I thought the same thing, but then I started looking at the bus function model (BFM) driver source code to see how a DMA simulation is performed, and now I think otherwise. For example, look at the file 'C:\altera\72\kits\ArriaGX_PCIe\Examples\PCIe_HighPerformanceDesign\Quartus\top_x4_examples\chaining_dma\testbench\altpcietb_bfm_driver_chaining.v'
I believe this file is one of the higher-level BFM driver routines. In it you can find the following (some parts omitted for brevity):

##########BEGIN CODE################
// Run the chained DMA write
task dma_wr_test(...);
   begin
      // Write the 'write' descriptor table in the RC memory
      dma_set_wr_desc_data(bar_table, setup_bar);

      // Write the descriptor header in EP memory PRG
      dma_set_header( ... );

   end
endtask
##########END CODE################

Looking at the called functions, dma_set_wr_desc_data() claims that it writes the descriptor table in root complex (PC / host) memory.
Also, the comments above the dma_set_header() function show descriptor header tables for both endpoint and root complex memory. The documentation almost reads as if there is one descriptor header table mapped to BAR 2. All in all, the code is not clear enough for porting to an actual implementation, because I can't tell whether shared memory means BFM driver memory or memory mapped by a BAR (or whether they are the same thing in an actual implementation).

I wish there were a document that explained the reference design a little more from the vantage point of someone who wants to modify the existing design, and not from the BFM vantage point. The BFM blurs what needs to be done by a PC and what is contained in the reference design.

Best of luck; it appears we both need some right now.
#5
July 23rd, 2008, 05:02 PM
Hey_Hey

Maybe I can clear up a few things...

The descriptor tables are located in the system's host memory (also known as root complex memory or BFM Shared Memory). See figure 7-2 of the 8.0 PCIe Compiler User Guide. I'm not a software guy, but you will need to get the system to lock down that memory and give you the real physical memory address of it (not the virtual address the application would use). Same thing you need to do with the actual memory buffer data you want to transfer via DMA. The addresses in the descriptor table point to the data buffers to be transferred. The descriptor table entries are described by tables 7-6, 7-7 and 7-8.

Then you must write the real physical address of the descriptor table to the Descriptor Table Header registers which are offset from BAR2 (or BAR3:2) by the values shown in table 7-5. The Descriptor Table Header format is shown in tables 7-3, 7-4, and 7-5.
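The header write described above can be sketched in C roughly as follows. This is a minimal user-space sketch under stated assumptions, not Altera's code: `bar2` stands in for a pointer to the mapped BAR2 region, the helper name `dma_write_header` is hypothetical, and the 0x00/0x10 base offsets and the example DWORD values come from the posts in this thread rather than from a verified register map.

```c
#include <assert.h>
#include <stdint.h>

/* Base offsets of the two header register sets inside BAR2, as quoted in
 * this thread (write header at 0x00, read header at 0x10). */
#define DMA_WR_HEADER_BASE 0x00u
#define DMA_RD_HEADER_BASE 0x10u

/* Hypothetical helper: store the four header DWORDs DW0..DW3 to consecutive
 * 32-bit registers starting at `base`. `bar2` is assumed to point at the
 * mapped BAR2 region; any uint32_t array works for a dry run. */
static void dma_write_header(volatile uint32_t *bar2, uint32_t base,
                             const uint32_t dw[4])
{
    assert(base % 4 == 0);              /* registers are DWORD aligned */
    for (unsigned i = 0; i < 4; i++)
        bar2[base / 4 + i] = dw[i];     /* one 32-bit write per DWORD */
}
```

With the values from post #1, a read would be started with something like `dma_write_header(bar2, DMA_RD_HEADER_BASE, hdr)` where `hdr` holds DW0..DW3 and the address DWORD carries the physical address of the descriptor table. In a kernel driver the plain stores would presumably be replaced by `writel()` or `iowrite32()`.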

The Chaining DMA hardware will then read the Descriptor Table using MRd TLPs from the system host memory, using the address from the Descriptor Table Header register. The root complex hardware will automatically respond to the MRd TLP and return the data from the memory address. (huzj_ecc - your driver doesn't need to respond to the MRd TLP; in fact there is no way to do that. You just have to have the descriptor table locked down in memory and put the correct address in the Descriptor Table Header register.)

It does appear that the PCIe Compiler user guide is missing an important piece of information on how this is all set up: the organization of the actual descriptor table.

Byte Offset   Field
0-13          Reserved
14-15         EPLAST
16-31         Descriptor #1 (following the format of tables 7-6, 7-7, and 7-8)
32-47         Descriptor #2 (ditto)
48-63         Descriptor #3 (ditto)
.....         and so on, for as many descriptors as specified by the "Size" field in the descriptor table header register
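The layout above can be written down as C types, which makes the byte offsets checkable at compile time. This is only a sketch: the struct and function names are made up for illustration, and the contents of each 16-byte descriptor are left opaque because they are defined by the user guide tables, not by this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One 16-byte descriptor; the meaning of the four DWORDs is defined by the
 * user guide tables and is deliberately left opaque here. */
struct dma_descriptor {
    uint32_t dw[4];
};

/* Header area at the start of the table: bytes 0-13 reserved, 14-15 EPLAST. */
struct dma_table_header {
    uint8_t  reserved[14];
    uint16_t eplast;
};

static_assert(sizeof(struct dma_descriptor) == 16, "descriptor is 16 bytes");
static_assert(offsetof(struct dma_table_header, eplast) == 14, "EPLAST at byte 14");
static_assert(sizeof(struct dma_table_header) == 16, "header area is 16 bytes");

/* Byte offset of descriptor number n (1-based) from the start of the table. */
static size_t dma_descriptor_offset(unsigned n)
{
    assert(n >= 1);
    return sizeof(struct dma_table_header)
         + (size_t)(n - 1) * sizeof(struct dma_descriptor);
}
```

For example, `dma_descriptor_offset(3)` gives 48, matching the "48-63 Descriptor #3" row.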


I think the Descriptor Table must also be no more than 4KB in total size and can't cross a 4KB boundary.
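If that size constraint holds (it is stated here only as a belief, not as a documented rule), a driver could check a candidate table address before using it. The sketch below assumes the 16-byte header area plus 16 bytes per descriptor from the layout above; the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DMA_TABLE_LIMIT 4096u   /* 4 KB, per the constraint suggested above */

/* Total table size: 16-byte header area plus 16 bytes per descriptor. */
static size_t dma_table_bytes(unsigned num_desc)
{
    assert(num_desc >= 1);
    return 16u + 16u * (size_t)num_desc;
}

/* True if a table at bus address `addr` is at most 4 KB long and its first
 * and last byte fall inside the same 4 KB-aligned block. */
static bool dma_table_fits(uint64_t addr, unsigned num_desc)
{
    size_t bytes = dma_table_bytes(num_desc);
    if (bytes > DMA_TABLE_LIMIT)
        return false;
    uint64_t first_block = addr / DMA_TABLE_LIMIT;
    uint64_t last_block  = (addr + bytes - 1) / DMA_TABLE_LIMIT;
    return first_block == last_block;
}
```

Under these assumptions the largest table that fits holds 255 descriptors, and only when it starts exactly on a 4 KB boundary.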

The EPLAST field in the Descriptor Table is updated by the Chaining DMA hardware with the number of the last descriptor that was completed, when the hardware is enabled to do so by the EPLAST_ENA bit in the Descriptor Table Header register or the EPLAST_ENA bit in the actual descriptor.
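Polling EPLAST from software might look like the sketch below. It assumes the table is visible to the CPU through a pointer `table`, that EPLAST sits at bytes 14-15 as in the layout above, and that the field is little-endian; the function names and the bounded poll count are illustrative choices, not part of the reference design.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EPLAST_BYTE_OFFSET 14u  /* bytes 14-15 of the table, per the layout above */

/* Read the 16-bit EPLAST field. `table` is assumed to be the CPU-visible
 * mapping of the descriptor table in host memory. */
static uint16_t dma_read_eplast(const volatile void *table)
{
    const volatile uint8_t *p = (const volatile uint8_t *)table;
    /* Assemble the value byte by byte (little-endian assumed) to avoid
     * alignment and aliasing assumptions. */
    return (uint16_t)(p[EPLAST_BYTE_OFFSET] | (p[EPLAST_BYTE_OFFSET + 1] << 8));
}

/* Poll until EPLAST equals `last_desc` or `max_polls` reads have been made.
 * Returns true on completion, false on timeout. */
static bool dma_wait_done(const volatile void *table, uint16_t last_desc,
                          unsigned long max_polls)
{
    assert(table != 0);
    for (unsigned long i = 0; i < max_polls; i++) {
        if (dma_read_eplast(table) == last_desc)
            return true;
    }
    return false;
}
```

It is probably wise to initialize EPLAST to a value that cannot match the expected descriptor number before starting the DMA, so that a stale value is not mistaken for completion. Whether the hardware counts descriptors from 0 or 1 is not stated in this thread, so `last_desc` is left to the caller.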

heppermann - Yes, it looks like those steps you mentioned in the user guide are leftover from the previous simple DMA description.

I think I answered most of the questions with the above description. Please post any followups here. I will try to answer if I know the answer and when I can.
#6
July 31st, 2008, 12:29 PM
heppermann

Hello Hey Hey,
Thank you for your response, that was very informative and cleared up a lot.

I have at least one more point of confusion. There are two Chaining DMA Descriptor Headers, at offsets 0x00 and 0x10: the first for write and the other for read. Why is there a Direction bit in the Control Fields (Table 7-4 of the PCI Express Compiler User Guide 8.0)? Is this redundant, or is there some significance to this bit? I would assume the registers at 0x00 and 0x10 already specify the direction.

Thanks.
#7
July 31st, 2008, 07:37 PM
likewise

At least in Linux, -1 (or 0xffff...) is the default value when you are reading non-existent I/O or memory mapped locations. Not sure if this goes for PCI as well.

However, we cannot guess what you might be doing wrong without seeing source code.
#8
July 31st, 2008, 09:22 PM
heppermann

Hello,
Below is the minimal amount of code to test writing and then reading the BARs (or at least what I think it should be).

At the end of the probe() method I write 0x05 to the first byte of BAR 0 and of BAR 2, but read back 0xFF. I am sure the probe() method is actually being called because my print statements show up, and I have used the Linux pci_resource_len() call and found that the DMA control BAR is 1 KB and the DMA space BAR is 16 MB. I have also done some more elaborate tests where I do iowriteXX() followed by ioreadXX(), but I always read back all 1's.


code
#############################################

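#include <linux/module.h>
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/pci.h>
#include <linux/cdev.h>
#include <linux/device.h>
#include <linux/fs.h>
#include <linux/list.h>
#include <linux/rwsem.h>
#include <linux/slab.h>

/* Per-device state */
struct kayak_info {
    char name[DEVICE_NAME_SIZE];
    struct pci_dev *pdev;
    /* Locks on register access */
    struct rw_semaphore ep_sem;
    struct rw_semaphore sem_f;

    /* Char device structure */
    struct cdev cdev;
    /* Class device */
    struct class_device *class_dev;

    /* Linked list of endpoints */
    struct list_head endpoint_list;

    void __iomem *dma_ctrl;     /* BAR 2: DMA control registers */
    void __iomem *dma_space;    /* BAR 0: DMA memory space */

    u64 stream_handles;
};

static struct pci_device_id kayaks[] = {
    { PCI_DEVICE(0x1172, 0xE001) },
    /* empty entry marks the end of the list */
    { 0, },
};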

/* forward declarations */
static int kayak_probe(struct pci_dev *pdev, const struct pci_device_id *ent);
static void __devexit kayak_remove(struct pci_dev *pdev);

static struct pci_driver kayak_driver = {
    .name     = wdt_kayak_driver_name,
    .id_table = kayaks,
    .probe    = kayak_probe,
    /* __devexit functions must be referenced through __devexit_p() */
    .remove   = __devexit_p(kayak_remove),
};

static int kayak_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
{
    struct kayak_info *kayinfo;
    unsigned char irq;
    int ret;
    dev_t dev;
    u32 testval;

    printk(KERN_WARNING "kayak: **** kayak_probe() method ****\n");

    /* create a struct to hold important device info */
    kayinfo = kayak_alloc();
    if (kayinfo == NULL) {
        printk(KERN_WARNING "kayak: kayak_alloc() returned NULL\n");
        ret = -ENOMEM;
        goto err_kay_alloc;
    }

    /* get a device minor */
    dev = get_dev_t();

    /* Before the driver can access any device resource (I/O region or
     * interrupt) of the PCI device, it must enable the device
     * (incl. PCI PM wakeup). */
    ret = pci_enable_device(pdev);
    if (ret < 0) {
        printk(KERN_WARNING "kayak: error pci_enable_device = %d\n", ret);
        goto err_enbl_out;
    }
    printk(KERN_WARNING "kayak: enable success\n");

    kayinfo->pdev = pdev;

    init_rwsem(&kayinfo->ep_sem);
    init_rwsem(&kayinfo->sem_f);
    kayinfo->stream_handles = 0;

    kayinfo->dma_space = NULL;
    kayinfo->dma_ctrl = NULL;

    ret = pci_request_regions(pdev, "kayak");
    if (ret) {
        printk(KERN_WARNING "kayak: error requesting regions\n");
        goto err_disable;
    }

    /* I have a 32-bit PC, so only worry about BAR 0 and not BAR 1:0 */
    kayinfo->dma_space = pci_iomap(pdev, 0, 0);
    if (kayinfo->dma_space == NULL) {
        ret = -ENOMEM;          /* ret was still 0 from the call above */
        goto err_unmap;
    }

    kayinfo->dma_ctrl = pci_iomap(pdev, 2, 0);
    if (kayinfo->dma_ctrl == NULL) {
        ret = -ENOMEM;
        goto err_unmap;
    }

    /* use this device in bus mastering mode since this card
     * is capable of DMA */
    pci_set_master(pdev);
    printk(KERN_WARNING "kayak: set pdev to master\n");

    if (!(ret = pci_set_dma_mask(pdev, DMA_64BIT_MASK))) {
        printk(KERN_WARNING "kayak: system supports 64-bit DMA\n");
    } else {
        if ((ret = pci_set_dma_mask(pdev, DMA_32BIT_MASK))) {
            printk(KERN_WARNING "kayak: No usable DMA config\n");
            goto err_unmap;
        }
        printk(KERN_WARNING "kayak: system supports 32-bit DMA\n");
    }

    /* discover resource information... */

    /* find the irq # */
    pci_read_config_byte(pdev, PCI_INTERRUPT_LINE, &irq);
    printk(KERN_WARNING "kayak: irq# = %d\n", irq);

    kayinfo->class_dev = class_device_create(kayak_class, NULL, dev, NULL,
                                             "kayak%d", MINOR(dev));

    pci_set_drvdata(pdev, kayinfo);

    /* TODO: implement error handling */
    /* connect file operations with cdev */
    cdev_init(&kayinfo->cdev, &kayak_fops);

    /* connect the major/minor # with cdev */
    ret = cdev_add(&kayinfo->cdev, dev, 1);
    if (ret) {
        printk(KERN_WARNING "kayak: err setting up kayak cdev\n");
        goto err_cdev;
    }

    /* !! test writing then reading of BARs (single-byte accesses) !! */
    writeb(5, kayinfo->dma_ctrl);
    wmb();
    testval = readb(kayinfo->dma_ctrl);
    printk(KERN_WARNING "kayak: read 0x%02x\n", testval);

    writeb(5, kayinfo->dma_space);
    wmb();
    testval = readb(kayinfo->dma_space);
    printk(KERN_WARNING "kayak: read 0x%02x\n", testval);

    return 0;

err_cdev:
    /* destroy class device */
    class_device_destroy(kayak_class, dev);

err_unmap:
    /* undo pci_iomap() and pci_request_regions() with their counterparts */
    if (kayinfo->dma_ctrl != NULL)
        pci_iounmap(pdev, kayinfo->dma_ctrl);
    if (kayinfo->dma_space != NULL)
        pci_iounmap(pdev, kayinfo->dma_space);
    pci_release_regions(pdev);

err_disable:
    pci_disable_device(pdev);

err_enbl_out:
    put_kayak_dev_t(dev);
    printk(KERN_WARNING "kayak: error out, ret = %d\n", ret);
    kayak_unreg_childdevices(kayinfo);
    kfree(kayinfo);

err_kay_alloc:
    return ret;
}

static void __devexit kayak_remove(struct pci_dev *pdev)
{
    struct kayak_info *kayinfo;
    dev_t devno;

    kayinfo = pci_get_drvdata(pdev);

    /* access the device # */
    devno = kayinfo->cdev.dev;
    /* remove cdev */
    cdev_del(&kayinfo->cdev);

    /* free resources: unmap the BARs, then release all requested regions */
    if (kayinfo->dma_space != NULL)
        pci_iounmap(pdev, kayinfo->dma_space);

    if (kayinfo->dma_ctrl != NULL)
        pci_iounmap(pdev, kayinfo->dma_ctrl);

    pci_release_regions(pdev);
    pci_disable_device(pdev);

    class_device_destroy(kayak_class, devno);
    kfree(kayinfo);
}

static int __init jigapix_init_module(void)
{
    int retval;

    printk(KERN_WARNING "kayak: **** jigapix_init_module() ****\n");
    /* set the max # of kayaks */
    max_kayaks = sizeof(u32) * 8;
    /* no minor #s have been claimed yet */
    kayak_minors = 0;

    /* get a major */
    retval = alloc_chrdev_region(&kayak_major_dev, 0, max_kayaks, "kayak");
    if (retval < 0) {
        printk(KERN_WARNING "kayak: can't get major #\n");
        return retval;
    }

    kayak_class = class_create(THIS_MODULE, kayak_class_name);
    if (IS_ERR(kayak_class)) {
        retval = PTR_ERR(kayak_class);
        goto err_chrdev;
    }

    retval = pci_register_driver(&kayak_driver);
    if (retval)
        goto err_class;

    return 0;

err_class:
    class_destroy(kayak_class);
err_chrdev:
    unregister_chrdev_region(kayak_major_dev, max_kayaks);
    return retval;
}

static void __exit jigapix_exit_module(void)
{
    printk(KERN_WARNING "kayak: **** jigapix_exit_module() method ****\n");
    pci_unregister_driver(&kayak_driver);

    class_destroy(kayak_class);

    unregister_chrdev_region(kayak_major_dev, max_kayaks);
}

module_init(jigapix_init_module);
module_exit(jigapix_exit_module);

#################################################
end code
#9
July 31st, 2008, 09:28 PM
heppermann

Hello Hey Hey,
The Linux command lspci does show my device, and furthermore the kernel binds this device to my driver, so I am confident that the PnP features of PCI are operating correctly on my Arria dev board. It seems like the board is configuring correctly; hopefully member 'likewise' has a clue about what I am doing wrong.

Thanks.

PS. I bumped your rep points. I appreciate the continued help.
#10
August 4th, 2008, 06:48 PM
Hey_Hey

Hmm....You might not be doing anything wrong, this all might be 'features' of the chaining DMA hardware.

Whether there is actually anything behind the BAR0 is controlled by the USE_RCSLAVE Verilog generic on the altpcierd_example_app_chaining module in the example design. Depending on which version of the PCIe compiler you are using this may be set to either 0 or 1 by default. You can change it to a 1 and recompile your design to enable the memory behind BAR0.

When USE_RCSLAVE == 0 reads to BAR0 will not generate a completion on the PCIe link. The Root Complex (motherboard chipset) will timeout and probably return all FF's to the CPU.

Now the .sof file that comes with the dev kit should have this set to a 1, so the memory should be there.

But... I think the hardware may not respond completely correctly to a single-byte read. The completion would probably still be for a DWORD (4 bytes). The root complex may not like this and still return all FF's to the CPU.

Even though I'm not a software guy, with a little help from Google it looks like the readb() function you are using is just a single byte read. So if your design has USE_RCSLAVE == 1 (like the development kit .sof), I suggest trying to use writel() and readl() instead of writeb() and readb(), to see if that works for accesses to BAR0.
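A 32-bit write/read-back check along those lines could be sketched as below. It is written as plain C against a `volatile uint32_t *` so it can be tried in user space; the enum and function names are invented for this example, and in a kernel driver the two accesses would presumably go through `writel()` and `readl()` instead of direct dereferences.

```c
#include <assert.h>
#include <stdint.h>

enum bar_probe_result {
    BAR_PROBE_OK,        /* read back the pattern that was written */
    BAR_PROBE_ALL_ONES,  /* read 0xFFFFFFFF: possibly no completion returned */
    BAR_PROBE_MISMATCH   /* read something else */
};

/* Write a 32-bit pattern to one DWORD-aligned location and read it back.
 * `reg` is assumed to point into a mapped BAR. The pattern must not be all
 * ones, or a good read-back could not be told apart from a failed read. */
static enum bar_probe_result bar_probe32(volatile uint32_t *reg, uint32_t pattern)
{
    assert(pattern != 0xFFFFFFFFu);
    *reg = pattern;
    uint32_t got = *reg;
    if (got == pattern)
        return BAR_PROBE_OK;
    if (got == 0xFFFFFFFFu)
        return BAR_PROBE_ALL_ONES;
    return BAR_PROBE_MISMATCH;
}
```

Distinguishing the all-ones case from an ordinary mismatch helps separate "nothing answered the read" from "something answered with different data".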

Now as far as accesses to BAR2 go, it turns out those registers are write-only. The PCIe Compiler User Guide is just plain wrong on that. So reads to those will fail. The chaining DMA was designed to provide all of its status through interrupts and writes to the host memory. Those registers are also never changed by the hardware, so they always have the same value that was written by software. So there was no real functional need to have read-back; you just have to trust the hardware. Though I admit read-back would be nice for the "trust but verify" mindset.
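Since the hardware never changes those registers, software that wants to know their contents can keep its own copy of whatever it last wrote. The sketch below shows that shadow-register idea; it is a design suggestion rather than anything from the reference design, the names are hypothetical, and the register count assumes the header registers at BAR2 offsets 0x00-0x1c mentioned earlier in the thread.

```c
#include <assert.h>
#include <stdint.h>

#define DMA_CTRL_NUM_REGS 8u   /* header registers at BAR2 offsets 0x00-0x1c */

struct dma_ctrl_shadow {
    volatile uint32_t *bar2;             /* assumed mapping of BAR2 */
    uint32_t last[DMA_CTRL_NUM_REGS];    /* last value written to each register */
};

/* Write a register and remember the value in the software shadow. */
static void dma_ctrl_write(struct dma_ctrl_shadow *s, uint32_t offset,
                           uint32_t value)
{
    assert(offset % 4 == 0 && offset / 4 < DMA_CTRL_NUM_REGS);
    s->bar2[offset / 4] = value;
    s->last[offset / 4] = value;
}

/* Return the last value written, without touching the hardware. */
static uint32_t dma_ctrl_last_written(const struct dma_ctrl_shadow *s,
                                      uint32_t offset)
{
    assert(offset % 4 == 0 && offset / 4 < DMA_CTRL_NUM_REGS);
    return s->last[offset / 4];
}
```

The shadow only records what software wrote; it cannot confirm that the write actually reached the endpoint.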