Difference between revisions of "Novena Issue Log"

From Studio Kousagi Wiki
(Hardware)
(Bringup status by subsystem)
 
(94 intermediate revisions by 2 users not shown)
Line 7: Line 7:
 
! scope="col" | extended status
 
! scope="col" | extended status
 
| scope="col" class="unsortable" | notes
 
| scope="col" class="unsortable" | notes
 +
|-
 +
| Sleep/suspend || || || Need CI testbench for sleep/suspend to verify extended status
 
|-
 
|-
 
| PMIC base || OK ||  || voltages nominal, but on high side for CPU (1.35V) -- need PMIC DVFS driver to dial it back
 
| PMIC base || OK ||  || voltages nominal, but on high side for CPU (1.35V) -- need PMIC DVFS driver to dial it back
Line 14: Line 16:
 
| RTC backup battery || || ||
 
| RTC backup battery || || ||
 
|-
 
|-
| RTC || || || (extended status should also measure clock drift)
+
| RTC || OK || REV || 7-day run (619337 seconds to be precise) shows a drift of 52 seconds too fast = +84PPM stability referenced to Lenovo T61 RTC. New xtal being chosen for V2 due to obsoletion of current crystal, so will have to retest anyways. Run on SN007.
 
|-
 
|-
 
| debug console || OK || OK ||
 
| debug console || OK || OK ||
Line 20: Line 22:
 
| SDHC3 (microSD boot)|| OK ||  ||
 
| SDHC3 (microSD boot)|| OK ||  ||
 
|-
 
|-
| SDHC3 power switch || || ||
+
| SDHC3 power switch || OK || || works as-designed; no detail test of power-off since that kills the root partition, but the card does reset cleanly between reboots, which is the primary purpose of the switch.
 
|-
 
|-
| DDR3 base || OK ||  || functions at 1066 MT/s, 1 GB configured
+
| DDR3 base || OK || OK || functions at 1066 MT/s, 4 GB 1 and 2 rank configured, using http://memtester.sourcearchive.com/documentation/4.0.8/files.html for testing in userspace
 
|-
 
|-
| DDR3 extended || || || need to test multiple DIMM/mfg configs
+
| DDR3 extended || OK || OK || tested with 1 and 2 rank DIMMs, 1-4 GB configurations. Boards repeatedly surviving 200-hour stress testing runs.
 
|-
 
|-
 
| I2C1 (SMB)|| OK || || can read out DDR3 I2C config using i2cdump
 
| I2C1 (SMB)|| OK || || can read out DDR3 I2C config using i2cdump
Line 30: Line 32:
 
| I2C2 || OK ||  || can read out accelerometer, PMIC bits using i2cdump
 
| I2C2 || OK ||  || can read out accelerometer, PMIC bits using i2cdump
 
|-
 
|-
| I2C3 || FAIL || || may be driver issue
+
| I2C3 || OK || || Can talk to audio codec
 
|-
 
|-
| reset button || FAIL || || reset button hangs system, doesn't cause reboot
+
| reset button || OK || ||  
 
|-
 
|-
 
| USB hub 1 || OK || || tested with thumb drive, needs performance testing
 
| USB hub 1 || OK || || tested with thumb drive, needs performance testing
Line 40: Line 42:
 
| USB ext1 || OK || ||  
 
| USB ext1 || OK || ||  
 
|-
 
|-
| USB ext1 power switch || || ||
+
| USB ext1 power switch || OK || ||
 
|-
 
|-
 
| USB ext2 || OK || ||  
 
| USB ext2 || OK || ||  
 
|-
 
|-
| USB ext2 power switch || || ||
+
| USB ext2 power switch || OK || ||
 
|-
 
|-
| ASIX ethernet || OK || OK || 84,085,191 bytes in 18.53s = 36 Mbps (limited by external fiber uplink speed)
+
| ASIX ethernet || OK || OK || 96 Mbps measured. Note that MAC address must be generated and assigned (not fixed in hardware ROM)
 
 
note that MAC address must be generated and assigned (not fixed in hardware ROM)
 
 
|-
 
|-
| Gbit ethernet ||  || || Needs PHY driver before it can be tested
+
| Gbit ethernet || OK || OK || Close to 400 Mbps performance achieved in optimized speed tests, with correct PHY tuning parameters. This is apparently close to the theoretical maximum limit per erratum [http://cache.freescale.com/files/32bit/doc/errata/IMX6DQCE.pdf?fsrch=1&sr=5 ERR004512].
 
|-
 
|-
| SDHC4 || || ||
+
| SDHC2 || OK || || CD and RO pins both work.  Can access card.  Haven't tested speeds yet.
 
|-
 
|-
| utility EEPROM || || ||
+
| utility EEPROM || OK || || use 16-bit mode for access. eeprom tools need some tweaking, just wrote a couple bytes and declared success.
 
|-
 
|-
| audio base || || ||
+
| audio base || OK || ||
 
|-
 
|-
| audio power switch || || ||
+
| audio power switch || OK || || Connected to gpio145; requires ECO to strengthen "off" pulldown
 
|-
 
|-
| speakers || || ||
+
| speakers || OK || ||
 
|-
 
|-
| headphone || || ||
+
| headphone || OK || ||
 
|-
 
|-
| analog mic in  || || ||
+
| analog mic in  || OK || || requires DLRCK and ALRCK to be separated; see ECO notes below
 
|-
 
|-
| digital mic || || ||
+
| digital mic || N/A || || dropping feature in favor of using headset mic. Active requests to isolate passive input devices from system for security/privacy concerns.
 
|-
 
|-
| USB keyboard/mouse port || || ||
+
| USB keyboard/mouse port || N/A || || not testing because "it should just work" -- other ports on hub are shown to work and this is a direct pin-out of wires going to low-speed or at best case full-speed devices (so no concerns about high speed signal integrity issues)
 
|-
 
|-
| USB keyboard/mouse power switch || || ||
+
| USB keyboard/mouse power switch || OK || || Both power switches work
 
|-
 
|-
| USB high current (1.5A) charging || || ||
+
| USB high current (1.5A) charging || OK || || charges my note II just fine :)
 
|-
 
|-
| USB OTG || || ||
+
| USB OTG || OK || ||
 
|-
 
|-
| HDMI || || ||
+
| HDMI || OK || || Needs HPD to be inverted.
 
|-
 
|-
| FPGA || || ||
+
| FPGA || OK || || Need to enable clock by running ''devmem2 0x020c8160 w 0x00000920'', check values in PMU_REG_MISC1 under LVDS2_CLK_SEL
 
|-
 
|-
 
| FPGA apoptosis option || || ||
 
| FPGA apoptosis option || || ||
Line 84: Line 84:
 
| on-board USB wifi || || ||
 
| on-board USB wifi || || ||
 
|-
 
|-
| wifi power switch || || ||
+
| wifi power switch || OK || ||
 
|-
 
|-
| PCI-express || || ||
+
| PCI-express || OK || || Tested Atheros wifi card.  Comes up on boot, can do long wifi transfers.
 
|-
 
|-
| PCI-express power switch || || ||
+
| PCI-express power switch || OK || ||
 
|-
 
|-
 
| PCI-express embedded USB || || ||
 
| PCI-express embedded USB || || ||
 
|-
 
|-
| USIM || || ||
+
| USIM || N/A || || not validating, no target hardware to test against
 
|-
 
|-
| USIM power switch || || ||
+
| USIM power switch || N/A || || not validating, no target hardware to test against
 
|-
 
|-
| LCD port || || ||
+
| LCD port || OK || || via eDP to retina display
 
|-
 
|-
| LCD port USB || || ||
+
| LCD port USB || || ||
 
|-
 
|-
| LCD VCC power switch || || ||
+
| LCD VCC power switch || OK || ||  
 
|-
 
|-
| LCD backlight power switch || || ||
+
| LCD backlight power switch || OK || ||  
 
|-
 
|-
| touchscreen || || ||
+
| touchscreen || OK || || coarse validation of I2C connectivity
 
|-
 
|-
| user function button || || ||
+
| user function button || OK || || Connected to IOMUXC_IOMUXC_SW_MUX_CTL_PAD_KEY_ROW4 (GPIO4[15]) and IOMUXC_IOMUXC_SW_MUX_CTL_PAD_KEY_COL4 (GPIO4[14]).  Set gpio110 to output a 1, and read gpio111 to get the button state.
 
|-
 
|-
| uart 3 || || ||
+
| uart 3 || OK || ||
 
|-
 
|-
| uart 4 || || ||
+
| uart 4 || OK || ||
 
|-
 
|-
| accelerometer || || ||
+
| accelerometer || OK || || i2cset -y 0 0x1d 0x16 0x01; i2cdump -y -r 6-8 0 0x1d . seems to accelerometate just fine.
 
|-
 
|-
| SATA || || ||
+
| SATA || OK || || hotswap works
 
|-
 
|-
| SATA power switch || || ||
+
| SATA power switch || OK || || tied to gpio94
 
|-
 
|-
 
| battery interface || || ||
 
| battery interface || || ||
 
|-
 
|-
| boot option headers || || ||
+
| boot option headers || OK || || Was able to load U-Boot from SDHC3, SDHC2, and SATA
 
|-
 
|-
| JTAG || || ||
+
| JTAG || N/A || || not needed, not tested
 
|-
 
|-
| FPGA SPI memory || || ||
+
| FPGA SPI memory || N/A || || refactoring FPGA, likely to eliminate feature
 
|-
 
|-
| FPGA SPINOR memory || || ||
+
| FPGA SPINOR memory || N/A || || refactoring FPGA, probably keeping this feature but re-pinning out
 
|-
 
|-
| FPGA ADC || || ||
+
| FPGA ADC || N/A || || refactoring FPGA, likely to eliminate feature; redundant with touchscreen ADC gpio pins and external high-speed ADC  headers
 
|-
 
|-
| Raspberry Pi peripheral header || || ||
+
| Raspberry Pi peripheral header || N/A || || dropping feature on rev 2
 
|}
 
|}
  
Line 139: Line 139:
  
 
===Hardware===
 
===Hardware===
* Inrush current limiting for 3.3V_DELAYED turnon: R38N should be increased to about 30k. Need to verify with experiment turn-on timing margin (i.e. put smaller values in until failure to determine how much margin is available at 30k to ensure consistency across process variation)
+
* Inrush current limiting for 3.3V_DELAYED turnon: R38N should be increased to 10k. C30N increased to 1uF. This causes the ramp time to be about 2-3ms. Ramp is fairly smooth, with a small glitch as it crosses about 2.3V.
 
** SN001 has 33k -- revised to 10k to match other boards
 
** SN001 has 33k -- revised to 10k to match other boards
 
** SN004 has 10k (CPU did not boot at all with 47k)
 
** SN004 has 10k (CPU did not boot at all with 47k)
 
** SN003 has 10k (PMIC does not respond to commands post-boot with 47k)
 
** SN003 has 10k (PMIC does not respond to commands post-boot with 47k)
 
** SN002 has 10k, but has other problems preventing boot
 
** SN002 has 10k, but has other problems preventing boot
 +
 +
* Fix bug where reset button does not work due to FPGA pulling boot fuses to look at SATA instead of SD card. Resolution is to change HSWAPEN to high, which turns off FPGA pull-ups. ECO applied to all 5 boards.
 +
 +
* 1Gbit PHY reset circuit (per Micrel datasheet) interferes with driver timing. The reset rises too slowly, driver checks MII status immediately and never retries dynamically (does a static read-out of MII data). Fix is to remove the local reset circuit. ECO applied to all 5 boards.
 +
 +
* PCIE_PWRON signal wasn't connected. Oops! In next rev, it will go to GPIO_16, pad R1. For now, it will be hard-wired to powered on. Fix done on all 5 boards.
 +
 +
* 1Gbit PHY magnetics termination is incorrect. Remove R14G to improve performance to gigabit-speeds. This was a lucky shot in the dark, however, I don't actually understand what's going on with the center-tap termination, and what the trade-offs are. The link still needs to be deeply characterized for NEXT, FEXT, etc. to verify that it is optimally terminated.
 +
 +
* 1Gbit refclock stability (sourced by the PHY chip) looks poor. Termination seems to be causing some signal amplitude degradation. Doesn't seem to have fundamental jitter; rather, the waveform's shape is unstable, which causes the crossing to shift as the waveform changes shape. This needs to be investigated further. Shorting across R21G improves signal integrity to the point where the link has both Tx and Rx stability on all 4 characterizable boards; change done on all 5 boards.
 +
 +
* HDMI HPD sense is inverted. Need to add Q17L, R29L, and swap R28L/R27L to fix the issue. Done to four boards (Jacob's board is unavailable).
 +
 +
* Audio power-off pulldown is too weak. With the 100 ohm resistor, about 10mA is observed to flow through leakage paths, leaving about 1V on the power rail. Replace R21A with a 10 ohm resistor. Done on only one board (sean's board -- SN has rubbed off)
 +
 +
* Inrush current limit on 5.0V_DELAYED needs adjusting (similar to 3.3V_DELAYED). Too fast turn-on causes charge-sharing to deplete the filter capacitors and causes the regulator to believe a short circuit condition has happened. R29N increased to 10k, C27N increased to 1uF. Turn-on transient is about 2-3ms. Ramp is not monotonic. After voltage passes about 2V, a 'comb' appears in the transient for just under 2ms. This 'comb' is presumably due to slave regulators powering on and drawing current through the turn-on transient.
 +
 +
* Default reset timing from PMIC is insufficient to allow for the various turn-on ramps in the system. Therefore, an APX803 reset monitor must be added, watching the +5V line. This forces reset to be active for ~200ms past +5V stabilization. Tested with a threshold of 4.2V-4.38V; final trip point should be fine in range of 4.2V-4.5V.
 +
 +
* Power-down leakage on main power input rail is sufficiently small that a fault condition on the +5V regulator does not clear until several seconds have passed, and the voltage on the input rail has dropped below about 1V. Add a 2.2k resistor on the router version of the board in shunt with the input rail to bleed input caps in case of fault condition. For battery powered version, must add an actively switched pull-down transistor on the battery board to bleed the caps.
 +
 +
* ALRCK needs to be independent of DLRCK on the ES8328. Merging them together causes the clocks to fight. Cut pin 9 and jumper to LCD_BL_ON (as it is muxed with an audio clock pin). LCD_BL_ON needs to find a new home as a GPIO, i.e. CSI0_DAT14.
 +
 +
====Unknown Issues====
 +
Or, the journal of unreproducible artifacts.
 +
 +
* C21N (22uF, 10V X5R 20%) on SN004 developed a spontaneous short-circuit. Device was in idle operation at the time of the incident, with no operator present to observe the failure instant. C21N showed signs of asymmetric internal delamination of internal dielectric layers; top surface was anisotropically cracked. C21N internal impedance near zero (< 0.02 ohms). NCP3020B did its job and went into self-protect mode, circuitry was not permanently damaged. C21N replaced with 10uF, 16V X5R capacitor and device has resumed normal operation. Working hypotheses for cause of crack are: (1) mechanical damage due to proximity to L10N, and vibration (either induced through handling or acoustic parasitic coupling of eddy currents) causing the capacitor to crack/delaminate; (2) thermal expansion mismatch of substrate vs capacitor causing stress to build up and crack the capacitor; (3) overstress of capacitor (excessive current, overshoot) causing failure of internal dielectric layers, possibly worsened through thermal internal stresses; (4) material defect in capacitor.
 +
 +
 +
====DDR3 Bringup====
 +
* See [[novena ddr3 notes]] for ddr3 bringup notes.
  
 
===ECO list===
 
===ECO list===
 
* R38N change to 10k, 1% 0402 (resolve inrush current limit issue)
 
* R38N change to 10k, 1% 0402 (resolve inrush current limit issue)
 
* R12F to DNP, R13F to 4.7k, 1% 0402 (resolve boot fuse issue with FPGA pull-ups)
 
* R12F to DNP, R13F to 4.7k, 1% 0402 (resolve boot fuse issue with FPGA pull-ups)
 +
* remove C32G, D12G, R20G, D11G; short across D12G with wire jumper or 0805 resistor, 0 ohm (resolve gbit ethernet reset issue)
 +
* remove R11X, tie gate of Q10X to P3.3V_DELAYED (off of pin 2 of Q11X) (resolve PCIE power on issue)
 +
* remove R14G (possibly resolve magnetics termination issue)
 +
* replace R21G with 0-ohm resistor or shunt (partially resolve Gbit refclock stability issue)
 +
* move R28L to R27L (HDMI HPD swap)
 +
* add Q17L (HDMI HPD swap)
 +
* add R29L (HDMI HPD swap)
 +
* change R21A from 100 ohms to 10 ohms
 +
* C30N changed to 1uF
 +
* R29N changed to 10k, 1%
 +
* C27N changed to 1uF
 +
* APX803-44SAG-7 added, monitoring the P5.0V line, and pulling down the RESETBMCU line.
 +
* probably all instances of RC timing for power switches need to be improved (see 10k/1uF patch)
 +
* add 2.2k resistor in shunt with input capacitor network (BATT_PWR to ground)
 +
* Cut ALRCK trace to U11A, and jumper pin 9 of U11A to LCD_BL_ON
 +
* Rewire LCD_BL_ON to a new GPIO on the i.MX6 (final GPIO TBD)
 +
* Connect PCIE_PWRON to i.MX6 (final GPIO TBD)
 +
 +
Summarized: [[Novena EVT to DVT changes]]
  
 
===Software===
 
===Software===
Line 153: Line 203:
 
* DDR3: need to come up with alternate poke files for different SO-DIMM types
 
* DDR3: need to come up with alternate poke files for different SO-DIMM types
 
* DDR3: need to figure out how to configure u-boot to recognize greater amounts of DRAM
 
* DDR3: need to figure out how to configure u-boot to recognize greater amounts of DRAM
* MMC: USDHC3 has to have the CD check return 1 at all times. '''Patch not yet checked into trunk''' - todo bunnie
+
* MMC: USDHC3 has to have the CD check return 1 at all times.
  
 
====Linux====
 
====Linux====
* MMC: device tree novena.dts descriptor edited to note that USDHC3 port is non-removable in order to enable boot '''Patch not yet checked into trunk''' - todo bunnie
+
* MMC: device tree novena.dts descriptor edited to note that USDHC3 port is non-removable in order to enable boot  
 
* Power: Driver for power currently assumes fixed regulators, an incorrect assumption. This needs to be changed to use the PFUZE PMIC. Currently, no drivers exist in the source tree (based off of the sabrelite). Recommend using Sabre board for smart devices as the base image instead of sabrelite - todo xobs
 
* Power: Driver for power currently assumes fixed regulators, an incorrect assumption. This needs to be changed to use the PFUZE PMIC. Currently, no drivers exist in the source tree (based off of the sabrelite). Recommend using Sabre board for smart devices as the base image instead of sabrelite - todo xobs
 
* USB/Power: PMIC does not turn on the USB VBUS by default, which causes internal root hub to fail. To fix this:
 
* USB/Power: PMIC does not turn on the USB VBUS by default, which causes internal root hub to fail. To fix this:
**add i2c2 to device tree (added to novena.dts but '''not yet checked into trunk''' - todo bunnie)
+
**add i2c2 to device tree (added to novena.dts)
 +
**run this to set CPU voltage to 1.25V rather than 1.375V:
 +
i2cset -y 1 0x08 0x20 38
 
**run this command to turn on the boost regulator:
 
**run this command to turn on the boost regulator:
  i2cset 1 0x08 0x66 0x48
+
devmem2 0x20e03a0 w 0x1b848  # set i2c bus speeds to something sane
 +
devmem2 0x20e03a4 w 0x1b848
 +
  i2cset -y 1 0x08 0x66 0x48   # actually set the regulator using i2c
 
* once USB is on, I can verify/see USB drives in both USB ports. See this:
 
* once USB is on, I can verify/see USB drives in both USB ports. See this:
 
<pre>
 
<pre>
Line 179: Line 233:
 
ip link set dev eth1 up
 
ip link set dev eth1 up
 
</pre>
 
</pre>
* i2c bus 3 (/dev/i2c-2) seems to be mis-configured, can't see the utility EEPROM there. Needs debugging, maybe dts is wrong? - todo bunnie
+
 
 +
====Gbit Ethernet Characterization====
 +
* Performance results benchmarking Gbit ethernet (run on SN004):
 +
<pre>
 +
bunnie@crashbox:/var/www/bunnie$ iperf -s
 +
------------------------------------------------------------
 +
Server listening on TCP port 5001
 +
TCP window size: 85.3 KByte (default)
 +
------------------------------------------------------------
 +
[  4] local 10.0.39.142 port 5001 connected with 10.0.39.241 port 55445
 +
------------------------------------------------------------
 +
Client connecting to 10.0.39.241, TCP port 5001
 +
TCP window size: 47.0 KByte (default)
 +
------------------------------------------------------------
 +
[  6] local 10.0.39.142 port 38307 connected with 10.0.39.241 port 5001
 +
[ ID] Interval      Transfer    Bandwidth
 +
[  6]  0.0-10.0 sec  280 MBytes  234 Mbits/sec
 +
[  4]  0.0-11.0 sec  6.01 MBytes  4.57 Mbits/sec
 +
[  5] local 10.0.39.142 port 5001 connected with 10.0.39.241 port 55446
 +
------------------------------------------------------------
 +
Client connecting to 10.0.39.241, TCP port 5001
 +
TCP window size: 47.0 KByte (default)
 +
------------------------------------------------------------
 +
[  6] local 10.0.39.142 port 38308 connected with 10.0.39.241 port 5001
 +
[  6]  0.0-10.0 sec  286 MBytes  239 Mbits/sec
 +
[  5]  0.0-10.1 sec  3.77 MBytes  3.12 Mbits/sec
 +
[  4] local 10.0.39.142 port 5001 connected with 10.0.39.241 port 55447
 +
------------------------------------------------------------
 +
Client connecting to 10.0.39.241, TCP port 5001
 +
TCP window size: 47.0 KByte (default)
 +
------------------------------------------------------------
 +
[  6] local 10.0.39.142 port 38309 connected with 10.0.39.241 port 5001
 +
[  6]  0.0-10.0 sec  286 MBytes  240 Mbits/sec
 +
[  4]  0.0-10.1 sec  3.48 MBytes  2.88 Mbits/sec
 +
[  5] local 10.0.39.142 port 5001 connected with 10.0.39.241 port 55448
 +
 
 +
from running
 +
 
 +
root@novena:~# iperf -c 10.0.39.142 -d
 +
 
 +
three times over
 +
</pre>
 +
Server is a Supermicro X9SCL with a Xeon E3-1230 CPU and 8GB ram, running off of an SSD using Ubuntu 12.04.1 LTS.
 +
 
 +
Both client and server are plugged into a TP-Link 5-port gigabit desktop switch, TL-SG1005D.
 +
 
 +
Not sure why the asymmetry in the result. Seems to be real, the upload speed from Novena is pretty slow.
 +
 
 +
Note that the fastest speed I've ever seen on my network is about 240 Mbps, even between "big, mature" computers. Maybe it's the $20 switches that I buy.
 +
 
 +
Poor Tx performance is board-specific, indicating a hardware issue on Tx path.
 +
 
 +
running on SN 001 gives
 +
 
 +
<pre>
 +
------------------------------------------------------------
 +
Client connecting to 10.0.239.142, TCP port 5001
 +
TCP window size: 20.7 KByte (default)
 +
------------------------------------------------------------
 +
[  5] local 10.0.239.241 port 54666 connected with 10.0.239.142 port 5001
 +
[  4] local 10.0.239.241 port 5001 connected with 10.0.239.142 port 38386
 +
[ ID] Interval      Transfer    Bandwidth
 +
[  4]  0.0-10.0 sec    230 MBytes    193 Mbits/sec
 +
[  5]  0.0-10.2 sec  55.3 MBytes  45.5 Mbits/sec
 +
</pre>
 +
 
 +
One possible culprit is the TX clock signal integrity (on the RGMII side), which looks pretty bad. The Tx signal jitter is also correspondingly poor. This could be because of bad PLL stability inside the iMX6, or could be because of ground bounce, power supply unsteadiness, etc.
 +
 
 +
Note RGMII_REF_CLK shows similar instability. Odd. Measuring RGMII_REF_CLK causes bitrate of Tx side to drop dramatically. Seems like we have some correlation here.
 +
 
 +
Shorting out the termination on RGMII_REF_CLK vastly improves Tx performance:
 +
 
 +
<pre>
 +
------------------------------------------------------------
 +
Client connecting to 10.0.239.142, TCP port 5001
 +
TCP window size: 20.7 KByte (default)
 +
------------------------------------------------------------
 +
[  4] local 10.0.239.241 port 44027 connected with 10.0.239.142 port 5001
 +
[  5] local 10.0.239.241 port 5001 connected with 10.0.239.142 port 38477
 +
[ ID] Interval      Transfer    Bandwidth
 +
[  4]  0.0-10.0 sec    184 MBytes    154 Mbits/sec
 +
[  5]  0.0-10.0 sec    157 MBytes    131 Mbits/sec
 +
------------------------------------------------------------
 +
</pre>
 +
 
 +
However, the signal still shows some jitter. There are probably a couple of things going on inside the PHY that I'm not understanding, so it warrants further investigation with a high speed scope. As a stop-gap, it may be acceptable to short out the termination resistor R21G as it seems the drive strength of the KSZ9021RN isn't good enough to power through it. Currently, only SN001 has the fix.
 +
 
 +
SN003 with fix:
 +
<pre>
 +
------------------------------------------------------------
 +
Client connecting to 10.0.239.142, TCP port 5001
 +
TCP window size: 45.5 KByte (default)
 +
------------------------------------------------------------
 +
[  5] local 10.0.239.241 port 50859 connected with 10.0.239.142 port 5001
 +
[  4] local 10.0.239.241 port 5001 connected with 10.0.239.142 port 38485
 +
[ ID] Interval      Transfer    Bandwidth
 +
[  5]  0.0-10.0 sec    180 MBytes    151 Mbits/sec
 +
[  4]  0.0-10.0 sec    160 MBytes    134 Mbits/sec
 +
</pre>
 +
 
 +
SN005 with fix:
 +
<pre>
 +
Client connecting to 10.0.239.142, TCP port 5001
 +
TCP window size: 53.7 KByte (default)
 +
------------------------------------------------------------
 +
[  5] local 10.0.239.241 port 51507 connected with 10.0.239.142 port 5001
 +
[  4] local 10.0.239.241 port 5001 connected with 10.0.239.142 port 38488
 +
[ ID] Interval      Transfer    Bandwidth
 +
[  5]  0.0-10.0 sec    179 MBytes    150 Mbits/sec
 +
[  4]  0.0-10.0 sec    159 MBytes    133 Mbits/sec
 +
</pre>
 +
 
 +
SN004 with fix:
 +
<pre>
 +
Client connecting to 10.0.239.142, TCP port 5001
 +
TCP window size: 53.7 KByte (default)
 +
------------------------------------------------------------
 +
[  5] local 10.0.239.241 port 57324 connected with 10.0.239.142 port 5001
 +
[  4] local 10.0.239.241 port 5001 connected with 10.0.239.142 port 38492
 +
[ ID] Interval      Transfer    Bandwidth
 +
[  5]  0.0-10.0 sec    176 MBytes    148 Mbits/sec
 +
[  4]  0.0-10.0 sec    161 MBytes    135 Mbits/sec
 +
</pre>
 +
 
 +
As a stop gap, it's ok, but the signal integrity of the refclk still looks like dogmeat.
 +
 
 +
==Software notes==
 +
 
 +
===GPIO management===
 +
GPIO pins are exposed in /sys/class/gpio/.  The manual refers to GPIOs by bank and pin, while /sys/class/gpio/ uses a flat pin number.  To convert from the manual's bank and pin to the flat number, use the following formula:
 +
32*(bank-1)+pin
 +
For example, the FPGA RESET pin is connected to GPIO5[7].  Using the formula above, we get 32*(5-1)+7 = gpio135.
 +
 
 +
To enable the pin for use, echo the pin number to /sys/class/gpio/export.  For example:
 +
echo 135 > /sys/class/gpio/export
 +
 
 +
A directory will appear with the pin name, in our case "gpio135".  To check the GPIO level, read the "value" file.  For example:
 +
cat /sys/class/gpio/gpio135/value
 +
 
 +
To turn the GPIO into an output, set the "direction" to "out".  For example:
 +
echo out > /sys/class/gpio/gpio135/direction
 +
 
 +
Once a GPIO has been converted to an output, the "value" file becomes writeable.
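
The bank-to-sysfs conversion above can be wrapped in a small shell helper. This is a minimal sketch; the function name gpio_number is an assumption made for illustration and is not part of any Novena tooling.

```shell
# Convert an i.MX6 bank/pin pair (GPIO<bank>[<pin>]) to the flat number
# used under /sys/class/gpio/, following 32*(bank-1)+pin.
# gpio_number is a hypothetical helper name.
gpio_number() {
    bank=$1
    pin=$2
    echo $(( 32 * ($bank - 1) + $pin ))
}

gpio_number 5 7     # FPGA RESET on GPIO5[7], prints 135
gpio_number 4 14    # GPIO4[14], prints 110
```

The result can be fed directly into the export step, for example: echo "$(gpio_number 5 7)" > /sys/class/gpio/export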
 +
 
 +
===Writing to SPI===
 +
SPI attempts to fill the buffer with the entirety of your write().  By default, the size of writes exceeds the SPI buffer, causing no data to get written.  Use the "bs" argument to dd to limit the size of each write. To send FPGA firmware out the port, run a command similar to the following:
 +
dd if=/home/root/novena_fpga.bit of=/dev/spidev32766.0 bs=128
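
If the bitstream is sent from more than one script, the dd call can be wrapped so the block size is set in one place. This is only a sketch: spi_send is a hypothetical name, and the 128-byte default simply mirrors the bs=128 used above rather than a measured buffer limit.

```shell
# Copy a file to a device node in fixed-size writes using dd.
# spi_send is a hypothetical wrapper; the default block size of 128
# mirrors the bs=128 value used in the command above.
spi_send() {
    src=$1
    dst=$2
    block=${3:-128}
    if [ ! -r "$src" ]; then
        echo "cannot read $src" >&2
        return 1
    fi
    dd if="$src" of="$dst" bs="$block" 2>/dev/null
}

# Example on the target (paths taken from the command above):
# spi_send /home/root/novena_fpga.bit /dev/spidev32766.0
```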
 +
 
 +
===FPGA===
 +
A compact script to configure the FPGA and start the clock to the FPGA at CPU native crystal frequency (24 MHz):
 +
<pre>
 +
echo "Setting export of reset pin"
 +
echo 135 > /sys/class/gpio/export
 +
echo "setting reset pin to out"
 +
echo out > /sys/class/gpio/gpio135/direction
 +
echo "flipping reset"                     
 +
echo 0 > /sys/class/gpio/gpio135/value
 +
echo 1 > /sys/class/gpio/gpio135/value
 +
                                     
 +
echo "configuring FPGA"             
 +
                                     
 +
dd if=/home/root/novena_fpga.bit of=/dev/spidev32766.0 bs=128
 +
                                                           
 +
echo "turning on clock to FPGA"                             
 +
devmem2 0x020c8160 w 0x00000D2B     
 +
</pre>
 +
 
 +
The FPGA is connected to the CPU via EIM. To configure EIM for use, some registers have to be poked.
 +
 
 +
The command set below configures EIM for 16-bit synchronous mode.
 +
<pre>
 +
devmem2 0x020c4080 w 0xfc3
 +
devmem2 0x021b8000 w 0x403104b1
 +
devmem2 0x021b8008 w 0x04000000
 +
devmem2 0x021b800c w 0x00000008
 +
devmem2 0x021b8010 w 0x04000000
 +
</pre>
 +
 
 +
The command set below configures EIM for 16-bit asynchronous mode.
 +
<pre>
 +
devmem2 0x020c4080 w 0xfc3
 +
devmem2 0x021b8000 w 0x403104b1
 +
devmem2 0x021b8008 w 0x0b010000
 +
devmem2 0x021b800c w 0x00000008
 +
devmem2 0x021b8010 w 0x00040040
 +
</pre>
 +
 
 +
<pre>
 +
devmem2 0x020c4080 w 0xfc3
 +
 
 +
# EIM_CS0GCR1  0101 0 001 1 100 0 001 00 00 0 000 1 0 1 1 1 0 0 1
 +
# 0101 0100 1100 0001 0000 0000 1011 1001
 +
devmem2 0x21b8000 w 0x54c100b9
 +
# EIM_CS0GCR2  0000 0000 0000 0000 0001 00 0 0 0000 00 01
 +
# 0000 0000 0000 0000 0001 0000 0000 0001
 +
devmem2 0x21b8004 w 0x1001
 +
 
 +
# EIM_CS0RCR1  00 000001 0 001 1 000 0 010 0 010 0 010 0 010
 +
# 0000 0001 0001 1000 0010 0010 0010 0010
 +
devmem2 0x21b8008 w 0x01182222
 +
# EIM_CS0RCR1  00 000100 0 010 0 000 0 000 0 000 0 000 0 000
 +
# 0000 0100 0010 0000 0000 0000 0000 0000
 +
devmem2 0x21b8008 w 0x04000000
 +
 
 +
 
 +
# EIM_CS0RCR2  0000 0000 0 000 00 01 0 001 0 001
 +
# 0000 0000 0000 0001 0001 0001
 +
devmem2 0x21b800c w 0x00000111
 +
 
 +
# EIM_CS0WCR1  0100 0100 000 000 000 000 000 000 000 000
 +
devmem2 0x21b8010 w 0x44000000
 +
 
 +
 
 +
# EIM_WCR  0000 0000 0000 0000 0000 0 00 1 00 0 0 0 00 1
 +
# 0000 0000 0000 0000 0000 0001 0000 0001
 +
devmem2 0x21b8090 w 0x101
 +
 
 +
# EIM_WIAR 0000 0000 0000 0000 0000 0000 000 1 0 0 0 0
 +
devmem2 0x21b8094 w 0x10
 +
</pre>
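
The register comments in the block above spell out each value as spaced binary fields. A small helper can turn such a string back into hex so it can be compared against the value passed to devmem2. This is a sketch in plain POSIX shell; bits_to_hex is a hypothetical name, and it assumes the input contains only 0, 1 and spaces.

```shell
# Convert a spaced binary string (as used in the register comments)
# into a 32-bit hex value. bits_to_hex is a hypothetical helper and
# does no input validation beyond stripping spaces.
bits_to_hex() {
    bits=$(printf '%s' "$1" | tr -d ' ')
    value=0
    while [ -n "$bits" ]; do
        rest=${bits#?}              # everything after the first digit
        digit=${bits%"$rest"}       # the first digit itself
        value=$(( $value * 2 + $digit ))
        bits=$rest
    done
    printf '0x%08x\n' "$value"
}

bits_to_hex "0101 0100 1100 0001 0000 0000 1011 1001"   # EIM_CS0GCR1, prints 0x54c100b9
bits_to_hex "0000 0000 0000 0000 0001 0000 0000 0001"   # EIM_CS0GCR2, prints 0x00001001
```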
 +
 
 +
EIM CS0 starts at 0x08000000.
 +
 
 +
 
 +
EIM DA0-15 pad configuration:
 +
<pre>
 +
devmem2 0x020e0114 w 0x00000000
 +
devmem2 0x020e0428 w 0x0000b0b1
 +
devmem2 0x020e0118 w 0x00000000
 +
devmem2 0x020e042c w 0x0000b0b1
 +
devmem2 0x020e011c w 0x00000000
 +
devmem2 0x020e0430 w 0x0000b0b1
 +
devmem2 0x020e0120 w 0x00000000
 +
devmem2 0x020e0434 w 0x0000b0b1
 +
devmem2 0x020e0124 w 0x00000000
 +
devmem2 0x020e0438 w 0x0000b0b1
 +
devmem2 0x020e0128 w 0x00000000
 +
devmem2 0x020e043c w 0x0000b0b1
 +
devmem2 0x020e012c w 0x00000000
 +
devmem2 0x020e0440 w 0x0000b0b1
 +
devmem2 0x020e0130 w 0x00000000
 +
devmem2 0x020e0444 w 0x0000b0b1
 +
devmem2 0x020e0134 w 0x00000000
 +
devmem2 0x020e0448 w 0x0000b0b1
 +
devmem2 0x020e0138 w 0x00000000
 +
devmem2 0x020e044c w 0x0000b0b1
 +
devmem2 0x020e013c w 0x00000000
 +
devmem2 0x020e0450 w 0x0000b0b1
 +
devmem2 0x020e0140 w 0x00000000
 +
devmem2 0x020e0454 w 0x0000b0b1
 +
devmem2 0x020e0144 w 0x00000000
 +
devmem2 0x020e0458 w 0x0000b0b1
 +
devmem2 0x020e0148 w 0x00000000
 +
devmem2 0x020e045c w 0x0000b0b1
 +
devmem2 0x020e014c w 0x00000000
 +
devmem2 0x020e0460 w 0x0000b0b1
 +
devmem2 0x020e0150 w 0x00000000
 +
devmem2 0x020e0464 w 0x0000b0b1
 +
</pre>
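
The 32 writes above follow a regular pattern: for each of the 16 pads, a register at 0x020e0114 + 4*n is set to 0 and a register at 0x020e0428 + 4*n is set to 0xb0b1. A loop can generate the same list. This sketch prints the commands instead of executing them; eim_da_pad_commands is a hypothetical name.

```shell
# Print the devmem2 commands for the EIM DA0-15 pad configuration.
# Addresses and values are taken from the list above;
# eim_da_pad_commands is a hypothetical helper name.
eim_da_pad_commands() {
    n=0
    while [ "$n" -lt 16 ]; do
        printf 'devmem2 0x%08x w 0x00000000\n' $(( 0x020e0114 + 4 * $n ))
        printf 'devmem2 0x%08x w 0x0000b0b1\n' $(( 0x020e0428 + 4 * $n ))
        n=$(( $n + 1 ))
    done
}

eim_da_pad_commands
```

Printing first makes it easy to diff the generated list against the hand-written one before piping it to a shell on the target.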

Latest revision as of 07:48, 12 July 2013

Bringup status by subsystem

Subsystem basic status extended status notes
Sleep/suspend Need CI testbench for sleep/suspend to verify extended status
PMIC base OK voltages nominal, but on high side for CPU (1.35V) -- need PMIC DVFS driver to dial it back
PMIC advanced needs driver for further testing
RTC backup battery
RTC OK REV 7-day run (619337 seconds to be precise) shows a drift of 52 seconds too fast = +84PPM stability referenced to Lenovo T61 RTC. New xtal being chosen for V2 due to obsoletion of current crystal, so will have to retest anyways. Run on SN007.
debug console OK OK
SDHC3 (microSD boot) OK
SDHC3 power switch OK works as-designed; no detail test of power-off since that kills the root partition, but the card does reset cleanly between reboots, which is the primary purpose of the switch.
DDR3 base OK OK functions at 1066 MT/s, 4 GB 1 and 2 rank configured, using http://memtester.sourcearchive.com/documentation/4.0.8/files.html for testing in userspace
DDR3 extended OK OK tested with 1 and 2 rank DIMMs, 1-4 GB configurations. Boards repeatedly surviving 200-hour stress testing runs.
I2C1 (SMB) OK can read out DDR3 I2C config using i2cdump
I2C2 OK can read out accelerometer, PMIC bits using i2cdump
I2C3 OK Can talk to audio codec
reset button OK
USB hub 1 OK tested with thumb drive, needs performance testing
USB hub 2 OK ditto
USB ext1 OK
USB ext1 power switch OK
USB ext2 OK
USB ext2 power switch OK
ASIX ethernet OK OK 96 Mbps measured. Note that MAC address must be generated and assigned (not fixed in hardware ROM)
Gbit ethernet OK OK Close to 400 Mbps performance achieved in optimized speed tests, with correct PHY tuning parameters. This is apparently close to the theoretical maximum limit per erratum ERR004512.
SDHC2 OK CD and RO pins both work. Can access card. Haven't tested speeds yet.
utility EEPROM OK use 16-bit mode for access. eeprom tools need some tweaking, just wrote a couple bytes and declared success.
audio base OK
audio power switch OK Connected to gpio145; requires ECO to strengthen "off" pulldown
speakers OK
headphone OK
analog mic in OK requires DLRCK and ALRCK to be separated; see ECO notes below
digital mic N/A dropping feature in favor of using headset mic. Active requests to isolate passive input devices from system for security/privacy concerns.
USB keyboard/mouse port N/A not testing because "it should just work" -- other ports on hub are shown to work and this is a direct pin-out of wires going to low-speed or at best case full-speed devices (so no concerns about high speed signal integrity issues)
USB keyboard/mouse power switch OK Both power switches work
USB high current (1.5A) charging OK charges my note II just fine :)
USB OTG OK
HDMI OK Needs HPD to be inverted.
FPGA OK Need to enable clock by running devmem2 0x020c8160 w 0x00000920, check values in PMU_REG_MISC1 under LVDS2_CLK_SEL
FPGA apoptosis option
on-board USB wifi
wifi power switch OK
PCI-express OK Tested Atheros wifi card. Comes up on boot, can do long wifi transfers.
PCI-express power switch OK
PCI-express embedded USB
USIM N/A not validating, no target hardware to test against
USIM power switch N/A not validating, no target hardware to test against
LCD port OK via eDP to retina display
LCD port USB
LCD VCC power switch OK
LCD backlight power switch OK
touchscreen OK coarse validation of I2C connectivity
user function button OK Connected to IOMUXC_IOMUXC_SW_MUX_CTL_PAD_KEY_ROW4 (GPIO4[15]) and IOMUXC_IOMUXC_SW_MUX_CTL_PAD_KEY_COL4 (GPIO4[14]). Set gpio110 to output a 1, and read gpio111 to get the button state.
uart 3 OK
uart 4 OK
accelerometer OK i2cset -y 0 0x1d 0x16 0x01; i2cdump -y -r 6-8 0 0x1d . seems to accelerometate just fine.
SATA OK hotswap works
SATA power switch OK tied to gpio94
battery interface
boot option headers OK Was able to load U-Boot from SDHC3, SDHC2, and SATA
JTAG N/A not needed, not tested
FPGA SPI memory N/A refactoring FPGA, likely to eliminate feature
FPGA SPINOR memory N/A refactoring FPGA, probably keeping this feature but re-pinning out
FPGA ADC N/A refactoring FPGA, likely to eliminate feature; redundant with touchscreen ADC gpio pins and external high-speed ADC headers
Raspberry Pi peripheral header N/A dropping feature on rev 2
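
The RTC drift figure in the table above (52 seconds fast over a 619337 second run) can be reproduced with a one-liner. This is only a sketch of the arithmetic; `drift_ppm` is a hypothetical helper name, not part of any Novena tooling.

```shell
# drift_ppm <seconds of drift> <seconds of run time>
# Prints the drift in parts per million, to one decimal place.
drift_ppm() {
    awk -v d="$1" -v t="$2" 'BEGIN { printf "%.1f\n", d / t * 1e6 }'
}

# The SN007 run from the table: 52 s fast over 619337 s.
drift_ppm 52 619337
```

The same helper can be reused when the V2 crystal is retested, as long as drift and run time are both given in seconds.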

Power consumption notes

System at idle with no PM code running and 1GB standard RAM draws 0.34A at 11.3V, or about 3.8W (measured at input regulator cap)

Known issues

Hardware

  • Inrush current limiting for 3.3V_DELAYED turnon: R38N should be increased to 10k. C30N increased to 1uF. This causes the ramp time to be about 2-3ms. Ramp is fairly smooth, with a small glitch as it crosses about 2.3V.
    • SN001 has 33k -- revised to 10k to match other boards
    • SN004 has 10k (CPU did not boot at all with 47k)
    • SN003 has 10k (PMIC does not respond to commands post-boot with 47k)
    • SN002 has 10k, but has other problems preventing boot
  • Fix bug where reset button does not work due to FPGA pulling boot fuses to look at SATA instead of SD card. Resolution is to change HSWAPEN to high, which turns off FPGA pull-ups. ECO applied to all 5 boards.
  • 1Gbit PHY reset circuit (per Micrel datasheet) interferes with driver timing. The reset rises too slowly, driver checks MII status immediately and never retries dynamically (does a static read-out of MII data). Fix is to remove the local reset circuit. ECO applied to all 5 boards.
  • PCIE_PWRON signal wasn't connected. Oops! In next rev, it will go to GPIO_16, pad R1. For now, it will be hard-wired to powered on. Fix done on all 5 boards.
  • 1Gbit PHY magnetics termination is incorrect. Remove R14G to improve performance to gigabit speeds. This was a lucky shot in the dark: I don't actually understand what's going on with the center-tap termination, or what the trade-offs are. The link still needs to be deeply characterized for NEXT, FEXT, etc. to verify that it is optimally terminated.
  • 1Gbit refclock stability (sourced by the PHY chip) looks poor. Termination seems to be causing some signal amplitude degradation. It doesn't seem to be fundamental jitter; rather, the waveform's shape is unstable, which causes the crossing to shift as the waveform changes shape. This needs to be investigated further. Shorting across R21G improves signal integrity to the point where the link has both Tx and Rx stability on all 4 characterizable boards; change done on all 5 boards.
  • HDMI HPD sense is inverted. Need to add Q17L, R29L, and swap R28L/R27L to fix the issue. Done to four boards (Jacob's board is unavailable).
  • Audio power-off pulldown is too weak. With the 100 ohm resistor, about 10mA is observed to flow in through leakage paths, which leaves about 1V on the power rail. Replace R21A with a 10 ohm resistor. Done to only one board (sean's board -- SN has rubbed off)
  • Inrush current limit on 5.0V_DELAYED needs adjusting (similar to 3.3V_DELAYED). Too fast turn-on causes charge-sharing to deplete the filter capacitors and causes the regulator to believe a short circuit condition has happened. R29N increased to 10k, C27N increased to 1uF. Turn-on transient is about 2-3ms. Ramp is not monotonic. After voltage passes about 2V, a 'comb' appears in the transient for just under 2ms. This 'comb' is presumably due to slave regulators powering on and drawing current through the turn-on transient.
  • Default reset timing from PMIC is insufficient to allow for the various turn-on ramps in the system. Therefore, an APX803 reset monitor must be added, watching the +5V line. This forces reset to be active for ~200ms past +5V stabilization. Tested with a threshold of 4.2V-4.38V; final trip point should be fine in range of 4.2V-4.5V.
  • Power-down leakage on main power input rail is sufficiently small that a fault condition on the +5V regulator does not clear until several seconds have passed, and the voltage on the input rail has dropped below about 1V. Add a 2.2k resistor on the router version of the board in shunt with the input rail to bleed input caps in case of fault condition. For battery powered version, must add an actively switched pull-down transistor on the battery board to bleed the caps.
  • ALRCK needs to be independent of DLRCK on the ES8328. Merging them together causes the clocks to fight. Cut pin 9 and jumper to LCD_BL_ON (as it is muxed with an audio clock pin). LCD_BL_ON needs to find a new home as a GPIO, i.e. CSI0_DAT14.

Unknown Issues

Or, the journal of unreproducible artifacts.

  • C21N (22uF, 10V X5R 20%) on SN004 developed a spontaneous short-circuit. Device was in idle operation at the time of the incident, with no operator present to observe the failure instant. C21N showed signs of asymmetric delamination of internal dielectric layers; top surface was anisotropically cracked. C21N internal impedance near zero (< 0.02 ohms). NCP3020B did its job and went into self-protect mode; circuitry was not permanently damaged. C21N replaced with 10uF, 16V X5R capacitor and device has resumed normal operation. Working hypotheses for cause of crack are: (1) mechanical damage due to proximity to L10N, and vibration (either induced through handling or acoustic parasitic coupling of eddy currents) causing the capacitor to crack/delaminate; (2) thermal expansion mismatch of substrate vs capacitor causing stress to build up and crack the capacitor; (3) overstress of capacitor (excessive current, overshoot) causing failure of internal dielectric layers, possibly worsened through thermal internal stresses; (4) material defect in capacitor.


DDR3 Bringup

ECO list

  • R38N change to 10k, 1% 0402 (resolve inrush current limit issue)
  • R12F to DNP, R13F to 4.7k, 1% 0402 (resolve boot fuse issue with FPGA pull-ups)
  • remove C32G, D12G, R20G, D11G; short across D12G with wire jumper or 0805 resistor, 0 ohm (resolve gbit ethernet reset issue)
  • remove R11X, tie gate of Q10X to P3.3V_DELAYED (off of pin 2 of Q11X) (resolve PCIE power on issue)
  • remove R14G (possibly resolve magnetics termination issue)
  • replace R21G with 0-ohm resistor or shunt (partially resolve Gbit refclock stability issue)
  • move R28L to R27L (HDMI HPD swap)
  • add Q17L (HDMI HPD swap)
  • add R29L (HDMI HPD swap)
  • change R21A from 100 ohms to 10 ohms
  • C30N changed to 1uF
  • R29N changed to 10k, 1%
  • C27N changed to 1uF
  • APX803-44SAG-7 added, monitoring the P5.0V line, and pulling down the RESETBMCU line.
  • probably all instances of RC timing for power switches need to be improved (see 10k/1uF patch)
  • add 2.2k resistor in shunt with input capacitor network (BATT_PWR to ground)
  • Cut ALRCK trace to U11A, and jumper pin 9 of U11A to LCD_BL_ON
  • Rewire LCD_BL_ON to a new GPIO on the i.MX6 (final GPIO TBD)
  • Connect PCIE_PWRON to i.MX6 (final GPIO TBD)

Summarized: Novena EVT to DVT changes

Software

U-boot

  • DDR3: need to come up with alternate poke files for different SO-DIMM types
  • DDR3: need to figure out how to configure u-boot to recognize greater amounts of DRAM
  • MMC: USDHC3 has to have the CD check return 1 at all times.

Linux

  • MMC: device tree novena.dts descriptor edited to note that USDHC3 port is non-removable in order to enable boot
  • Power: Driver for power currently assumes fixed regulators, an incorrect assumption. This needs to be changed to use the PFUZE PMIC. Currently, no drivers exist in the source tree (based off of the sabrelite). Recommend using Sabre board for smart devices as the base image instead of sabrelite - todo xobs
  • USB/Power: PMIC does not turn on the USB VBUS by default, which causes internal root hub to fail. To fix this:
    • add i2c2 to device tree (added to novena.dts)
    • run this to set CPU voltage to 1.25V rather than 1.375V:
i2cset -y 1 0x08 0x20 38
    • run this command to turn on the boost regulator:
devmem2 0x20e03a0 w 0x1b848  # set i2c bus speeds to something sane
devmem2 0x20e03a4 w 0x1b848 
i2cset -y 1 0x08 0x66 0x48   # actually set the regulator using i2c
  • once USB is on, I can verify/see USB drives in both USB ports. See this:
root@novena:~# lsusb 
Bus 002 Device 002: ID 05e3:0614 Genesys Logic, Inc. 
Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 002 Device 007: ID 0b95:772b ASIX Electronics Corp. 
Bus 002 Device 004: ID 05e3:0614 Genesys Logic, Inc. 
Bus 002 Device 006: ID 058f:6387 Alcor Micro Corp. Transcend JetFlash Flash Drive
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
  • ASIX driver module isn't built into current image. Need to figure out how to turn that on.
    • Configuring ASIX:
ip link set dev eth1 down
ip link set dev eth1 address de:ad:fe:ed:00:01
ip link set dev eth1 up
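
Since the ASIX MAC address is not fixed in hardware, one has to be picked before running the commands above. A minimal sketch of one way to do that follows; `gen_mac` is a hypothetical helper, and it assumes `od` and `/dev/urandom` are available on the image. It forces the first octet to be unicast and locally administered (like the de:ad:fe:ed:00:01 example), so the address should not collide with a vendor-assigned one.

```shell
# gen_mac: print a random, locally administered, unicast MAC address.
gen_mac() {
    # Read six random bytes as two-digit hex words, e.g. "3f a1 07 ...".
    set -- $(od -An -N6 -tx1 /dev/urandom)
    # Clear bit 0 (multicast) and set bit 1 (locally administered)
    # in the first octet.
    first=$(( (0x$1 & 0xfe) | 0x02 ))
    printf '%02x:%s:%s:%s:%s:%s\n' "$first" "$2" "$3" "$4" "$5" "$6"
}

gen_mac
```

The output can be passed straight to `ip link set dev eth1 address`. To keep the same address across reboots, generate it once and store it somewhere persistent rather than calling the helper at every boot.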

Gbit Ethernet Characterization

  • Performance results benchmarking 1 Gbit ethernet (run on SN004):
bunnie@crashbox:/var/www/bunnie$ iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 10.0.39.142 port 5001 connected with 10.0.39.241 port 55445
------------------------------------------------------------
Client connecting to 10.0.39.241, TCP port 5001
TCP window size: 47.0 KByte (default)
------------------------------------------------------------
[  6] local 10.0.39.142 port 38307 connected with 10.0.39.241 port 5001
[ ID] Interval       Transfer     Bandwidth
[  6]  0.0-10.0 sec   280 MBytes   234 Mbits/sec
[  4]  0.0-11.0 sec  6.01 MBytes  4.57 Mbits/sec
[  5] local 10.0.39.142 port 5001 connected with 10.0.39.241 port 55446
------------------------------------------------------------
Client connecting to 10.0.39.241, TCP port 5001
TCP window size: 47.0 KByte (default)
------------------------------------------------------------
[  6] local 10.0.39.142 port 38308 connected with 10.0.39.241 port 5001
[  6]  0.0-10.0 sec   286 MBytes   239 Mbits/sec
[  5]  0.0-10.1 sec  3.77 MBytes  3.12 Mbits/sec
[  4] local 10.0.39.142 port 5001 connected with 10.0.39.241 port 55447
------------------------------------------------------------
Client connecting to 10.0.39.241, TCP port 5001
TCP window size: 47.0 KByte (default)
------------------------------------------------------------
[  6] local 10.0.39.142 port 38309 connected with 10.0.39.241 port 5001
[  6]  0.0-10.0 sec   286 MBytes   240 Mbits/sec
[  4]  0.0-10.1 sec  3.48 MBytes  2.88 Mbits/sec
[  5] local 10.0.39.142 port 5001 connected with 10.0.39.241 port 55448

from running

root@novena:~# iperf -c 10.0.39.142 -d

three times over

Server is a Supermicro X9SCL with a Xeon E3-1230 CPU and 8GB ram, running off of an SSD using Ubuntu 12.04.1 LTS.

Both client and server are plugged into a TP-Link 5-port gigabit desktop switch, TL-SG1005D.

Not sure why the result is asymmetric. It seems to be real: the upload speed from Novena is pretty slow.

Note that the fastest speed I've ever seen on my network is about 240 Mbps, even between "big, mature" computers. Maybe it's the $20 switches that I buy.

Poor Tx performance is board-specific, indicating a hardware issue on Tx path.

running on SN 001 gives

------------------------------------------------------------
Client connecting to 10.0.239.142, TCP port 5001
TCP window size: 20.7 KByte (default)
------------------------------------------------------------
[  5] local 10.0.239.241 port 54666 connected with 10.0.239.142 port 5001
[  4] local 10.0.239.241 port 5001 connected with 10.0.239.142 port 38386
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec    230 MBytes    193 Mbits/sec
[  5]  0.0-10.2 sec  55.3 MBytes  45.5 Mbits/sec

One possible culprit is the TX clock signal integrity (on the RGMII side), which looks pretty bad. The Tx signal jitter is also correspondingly poor. This could be because of bad PLL stability inside the iMX6, or could be because of ground bounce, power supply unsteadiness, etc.

Note that RGMII_REF_CLK shows similar instability. Odd. Probing RGMII_REF_CLK causes the bitrate of the Tx side to drop dramatically, so there seems to be some correlation here.

Shorting out the termination on RGMII_REF_CLK vastly improves Tx performance:

------------------------------------------------------------
Client connecting to 10.0.239.142, TCP port 5001
TCP window size: 20.7 KByte (default)
------------------------------------------------------------
[  4] local 10.0.239.241 port 44027 connected with 10.0.239.142 port 5001
[  5] local 10.0.239.241 port 5001 connected with 10.0.239.142 port 38477
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec    184 MBytes    154 Mbits/sec
[  5]  0.0-10.0 sec    157 MBytes    131 Mbits/sec
------------------------------------------------------------

However, the signal still shows some jitter. There are probably a couple of things going on inside the PHY that I'm not understanding, so it warrants further investigation with a high speed scope. As a stop-gap, it may be acceptable to short out the termination resistor R21G as it seems the drive strength of the KSZ9021RN isn't good enough to power through it. Currently, only SN001 has the fix.

SN003 with fix:

------------------------------------------------------------
Client connecting to 10.0.239.142, TCP port 5001
TCP window size: 45.5 KByte (default)
------------------------------------------------------------
[  5] local 10.0.239.241 port 50859 connected with 10.0.239.142 port 5001
[  4] local 10.0.239.241 port 5001 connected with 10.0.239.142 port 38485
[ ID] Interval       Transfer     Bandwidth
[  5]  0.0-10.0 sec    180 MBytes    151 Mbits/sec
[  4]  0.0-10.0 sec    160 MBytes    134 Mbits/sec

SN005 with fix:

Client connecting to 10.0.239.142, TCP port 5001
TCP window size: 53.7 KByte (default)
------------------------------------------------------------
[  5] local 10.0.239.241 port 51507 connected with 10.0.239.142 port 5001
[  4] local 10.0.239.241 port 5001 connected with 10.0.239.142 port 38488
[ ID] Interval       Transfer     Bandwidth
[  5]  0.0-10.0 sec    179 MBytes    150 Mbits/sec
[  4]  0.0-10.0 sec    159 MBytes    133 Mbits/sec

SN004 with fix:

Client connecting to 10.0.239.142, TCP port 5001
TCP window size: 53.7 KByte (default)
------------------------------------------------------------
[  5] local 10.0.239.241 port 57324 connected with 10.0.239.142 port 5001
[  4] local 10.0.239.241 port 5001 connected with 10.0.239.142 port 38492
[ ID] Interval       Transfer     Bandwidth
[  5]  0.0-10.0 sec    176 MBytes    148 Mbits/sec
[  4]  0.0-10.0 sec    161 MBytes    135 Mbits/sec

As a stop gap, it's ok, but the signal integrity of the refclk still looks like dogmeat.

Software notes

GPIO management

GPIO pins are exposed in /sys/class/gpio/. The manual refers to a GPIO by bank and pin (for example GPIO5[7]), while /sys/class/gpio/ uses a single flat pin number. To convert from the manual's form to the flat number, use the following formula:

32*(bank-1)+pin

For example, the FPGA RESET pin is connected to GPIO5[7]. Using the formula above, we get 32*(5-1)+7 = gpio135.
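
The formula is easy to wrap in a small shell function. This is only a sketch; `gpio_num` is a hypothetical name used for illustration.

```shell
# gpio_num <bank> <pin>
# Convert a GPIO in the manual's GPIOb[p] form to the flat number
# used under /sys/class/gpio/, using 32*(bank-1)+pin.
gpio_num() {
    bank=$1
    pin=$2
    echo $(( 32 * (bank - 1) + pin ))
}

gpio_num 5 7     # FPGA RESET, GPIO5[7]
gpio_num 4 15    # user function button, GPIO4[15]
```

The two calls print 135 and 111, matching the gpio135 and gpio111 numbers used elsewhere on this page.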

To enable the pin for use, echo the pin number to /sys/class/gpio/export. For example:

echo 135 > /sys/class/gpio/export

A directory will appear with the pin name, in our case "gpio135". To check the GPIO level, read the "value" file. For example:

cat /sys/class/gpio/gpio135/value

To turn the GPIO into an output, set the "direction" to "out". For example:

echo out > /sys/class/gpio/gpio135/direction

Once a GPIO has been converted to an output, the "value" file becomes writeable.
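
The export, direction, and value steps above can be combined into one helper. The following is a sketch under a few assumptions: `gpio_write` and the `GPIO_ROOT` variable are hypothetical names (the variable only exists so the function can be pointed at a scratch directory for a dry run), and the function skips the export step if the pin directory already exists.

```shell
# gpio_write <flat pin number> <0|1>
# Export the pin if needed, switch it to an output, and drive it.
gpio_write() {
    root=${GPIO_ROOT:-/sys/class/gpio}
    pin=$1
    level=$2
    # Only export once; the gpioN directory appears after the export.
    if [ ! -d "$root/gpio$pin" ]; then
        echo "$pin" > "$root/export"
    fi
    echo out > "$root/gpio$pin/direction"
    echo "$level" > "$root/gpio$pin/value"
}
```

For example, `gpio_write 135 0` followed by `gpio_write 135 1` would pulse the FPGA RESET pin. This needs to run as root on the board.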

Writing to SPI

SPI attempts to fill the buffer with the entirety of your write(). By default, the size of writes exceeds the SPI buffer, causing no data to get written. Use the "bs" argument to dd in order to get around this. To send an FPGA firmware out the port, run a command similar to the following:

dd if=/home/root/novena_fpga.bit of=/dev/spidev32766.0 bs=128

FPGA

A compact script to configure the FPGA and start the clock to the FPGA at CPU native crystal frequency (24 MHz):

echo "Setting export of reset pin"
echo 135 > /sys/class/gpio/export
echo "setting reset pin to out"
echo out > /sys/class/gpio/gpio135/direction
echo "flipping reset"
echo 0 > /sys/class/gpio/gpio135/value
echo 1 > /sys/class/gpio/gpio135/value

echo "configuring FPGA"

dd if=/home/root/novena_fpga.bit of=/dev/spidev32766.0 bs=128

echo "turning on clock to FPGA"
devmem2 0x020c8160 w 0x00000D2B

The FPGA is connected to the CPU via EIM. To configure EIM for use, some registers have to be poked.

The command set below configures EIM for 16-bit synchronous mode.

devmem2 0x020c4080 w 0xfc3
devmem2 0x021b8000 w 0x403104b1
devmem2 0x021b8008 w 0x04000000
devmem2 0x021b800c w 0x00000008
devmem2 0x021b8010 w 0x04000000

The command set below configures EIM for 16-bit asynchronous mode.

devmem2 0x020c4080 w 0xfc3
devmem2 0x021b8000 w 0x403104b1
devmem2 0x021b8008 w 0x0b010000
devmem2 0x021b800c w 0x00000008
devmem2 0x021b8010 w 0x00040040
devmem2 0x020c4080 w 0xfc3

# EIM_CS0GCR1   0101 0 001 1 100 0 001 00 00 0 000 1 0 1 1 1 0 0 1 
# 0101 0100 1100 0001 0000 0000 1011 1001 
devmem2 0x21b8000 w 0x54c100b9
# EIM_CS0GCR2   0000 0000 0000 0000 0001 00 0 0 0000 00 01
# 0000 0000 0000 0000 0001 0000 0000 0001
devmem2 0x21b8004 w 0x1001

# EIM_CS0RCR1   00 000001 0 001 1 000 0 010 0 010 0 010 0 010
# 0000 0001 0001 1000 0010 0010 0010 0010
devmem2 0x21b8008 w 0x01182222
# EIM_CS0RCR1   00 000100 0 010 0 000 0 000 0 000 0 000 0 000
# 0000 0100 0010 0000 0000 0000 0000 0000
devmem2 0x21b8008 w 0x04000000


# EIM_CS0RCR2   0000 0000 0 000 00 01 0 001 0 001 
# 0000 0000 0000 0001 0001 0001 
devmem2 0x21b800c w 0x00000111

# EIM_CS0WCR1   0100 0100 000 000 000 000 000 000 000 000
devmem2 0x21b8010 w 0x44000000


# EIM_WCR  0000 0000 0000 0000 0000 0 00 1 00 0 0 0 00 1
# 0000 0000 0000 0000 0000 0001 0000 0001
devmem2 0x21b8090 w 0x101

# EIM_WIAR 0000 0000 0000 0000 0000 0000 000 1 0 0 0 0
devmem2 0x21b8094 w 0x10
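
The comments above write each register value out as groups of bits before giving the hex word passed to devmem2. A small helper can redo that conversion, which makes it easier to cross-check a comment against the value actually written. This is a sketch only; `bits_to_hex` is a hypothetical name, and it assumes the bit string is written most significant bit first, as in the comments.

```shell
# bits_to_hex "<bit string>"
# Convert a string of 0/1 digits (spaces ignored, MSB first)
# into a 32-bit hex word in the form devmem2 accepts.
bits_to_hex() {
    bits=$(printf '%s' "$1" | tr -d ' ')
    value=0
    while [ -n "$bits" ]; do
        rest=${bits#?}            # everything after the first digit
        bit=${bits%"$rest"}       # the first digit itself
        case $bit in
            0|1) value=$(( value * 2 + bit )) ;;
            *) echo "bits_to_hex: not a binary digit: $bit" >&2
               return 1 ;;
        esac
        bits=$rest
    done
    printf '0x%08x\n' "$value"
}

# The EIM_CS0GCR1 bit string from the comments above.
bits_to_hex "0101 0100 1100 0001 0000 0000 1011 1001"
```

This prints 0x54c100b9, the value written to 0x21b8000. Either the field-grouped line or the nibble-grouped line of a comment can be passed in, since spaces are ignored.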

EIM CS0 starts at 0x08000000.


EIM DA0-15 pad configuration:

devmem2 0x020e0114 w 0x00000000
devmem2 0x020e0428 w 0x0000b0b1
devmem2 0x020e0118 w 0x00000000
devmem2 0x020e042c w 0x0000b0b1
devmem2 0x020e011c w 0x00000000
devmem2 0x020e0430 w 0x0000b0b1
devmem2 0x020e0120 w 0x00000000
devmem2 0x020e0434 w 0x0000b0b1
devmem2 0x020e0124 w 0x00000000
devmem2 0x020e0438 w 0x0000b0b1
devmem2 0x020e0128 w 0x00000000
devmem2 0x020e043c w 0x0000b0b1
devmem2 0x020e012c w 0x00000000
devmem2 0x020e0440 w 0x0000b0b1
devmem2 0x020e0130 w 0x00000000
devmem2 0x020e0444 w 0x0000b0b1
devmem2 0x020e0134 w 0x00000000
devmem2 0x020e0448 w 0x0000b0b1
devmem2 0x020e0138 w 0x00000000
devmem2 0x020e044c w 0x0000b0b1
devmem2 0x020e013c w 0x00000000
devmem2 0x020e0450 w 0x0000b0b1
devmem2 0x020e0140 w 0x00000000
devmem2 0x020e0454 w 0x0000b0b1
devmem2 0x020e0144 w 0x00000000
devmem2 0x020e0458 w 0x0000b0b1
devmem2 0x020e0148 w 0x00000000
devmem2 0x020e045c w 0x0000b0b1
devmem2 0x020e014c w 0x00000000
devmem2 0x020e0460 w 0x0000b0b1
devmem2 0x020e0150 w 0x00000000
devmem2 0x020e0464 w 0x0000b0b1
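
The list above follows a regular pattern: for each of the 16 pads, one write of 0x00000000 to a register starting at 0x020e0114, and one write of 0x0000b0b1 to a register starting at 0x020e0428, with both addresses stepping by 4 per pad. Assuming that pattern is all there is to it, the list can be generated instead of typed. `eim_da_pad_cmds` is a hypothetical name, and the function only prints the commands so they can be reviewed first.

```shell
# eim_da_pad_cmds: print the 32 devmem2 commands for EIM DA0-15,
# in the same order as the hand-written list.
eim_da_pad_cmds() {
    i=0
    while [ "$i" -lt 16 ]; do
        # First register block, base 0x020e0114, written with 0.
        printf 'devmem2 0x%08x w 0x00000000\n' $(( 0x020e0114 + 4 * i ))
        # Second register block, base 0x020e0428, written with 0xb0b1.
        printf 'devmem2 0x%08x w 0x0000b0b1\n' $(( 0x020e0428 + 4 * i ))
        i=$(( i + 1 ))
    done
}

eim_da_pad_cmds
```

To apply the settings on the board, the output can be piped to a shell as root, for example `eim_da_pad_cmds | sh`.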