EC firmware_SoftwareSync.dev test case fails…

Martin Yan

May 2, 2018, 5:24:45 PM

to faft users

When I ran the EC firmware_SoftwareSync.dev test case, these were the steps:

1: reboot system – pass;

2: enable developer mode – pass;

3: corrupt EC firmware RW body;

4: reboot AP, check EC hash, and software sync it – pass, software resync works;

5: disable developer mode – pass;

05/02 14:56:54.772 INFO | mode_switcher:0545| -[mode_aware_reboot]-[ is_dev=False ]-

05/02 14:56:54.773 INFO | servo:0529| Setting fw_wp_state to force_on

05/02 14:56:54.790 INFO | servo:0529| Setting ec_uart_cmd to flashwp enable

05/02 14:56:55.188 INFO | firmware_test:0932| Blocking sync for /dev/mmcblk1

05/02 14:56:55.295 INFO | servo:0529| Setting ec_uart_cmd to reboot

After this, the system can’t boot up. The screen displays “Chrome OS is missing or damaged. Please remove all connected devices and start recovery.” When I press the TAB key, it shows either

recovery_reason: 0x54 / 0x54 TPM read error in rewriteable firmware;

or

recovery_reason: 0x2b / 0x2b Secure NVRAM (TPM) initialization error

What is the potential problem? See the attached “20180502-firmware_SoftwareSync_dev.zip”.

Other helpful information: firmware_TPMVersionCheck.ec_wp, firmware_DevMode.ec_wp, and firmware_SoftwareSync all pass.

20180502-firmware_SoftwareSync_dev.zip

IMG_1096.JPG

IMG_1097.JPG
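
The log excerpt in this post shows how quickly the EC reboot follows the write-protect command. A minimal sketch for measuring that gap, assuming the autotest log line layout seen above. The log carries no year, so 2018 is assumed from the post date, and the function names are illustrative rather than part of FAFT:

```python
import re
from datetime import datetime

# Log lines copied from the excerpt in this post.
LOG = """\
05/02 14:56:54.773 INFO | servo:0529| Setting fw_wp_state to force_on
05/02 14:56:54.790 INFO | servo:0529| Setting ec_uart_cmd to flashwp enable
05/02 14:56:55.188 INFO | firmware_test:0932| Blocking sync for /dev/mmcblk1
05/02 14:56:55.295 INFO | servo:0529| Setting ec_uart_cmd to reboot
"""

# Matches "<MM/DD HH:MM:SS.mmm> <LEVEL> | servo:<line>| Setting <control> to <value>".
SERVO_SET_RE = re.compile(
    r"^(\d\d/\d\d \d\d:\d\d:\d\d\.\d+)\s+\w+\s*\|\s*servo:\d+\|\s*"
    r"Setting (\w+) to (.+)$"
)


def servo_settings(log_text, year=2018):
    """Return (timestamp, control, value) for each servo 'Setting' line."""
    events = []
    for line in log_text.splitlines():
        match = SERVO_SET_RE.match(line.strip())
        if match:
            # The year is an assumption; the log itself only has month/day.
            stamp = datetime.strptime(
                f"{year} {match.group(1)}", "%Y %m/%d %H:%M:%S.%f"
            )
            events.append((stamp, match.group(2), match.group(3).strip()))
    return events


def gap_seconds(events, first, second):
    """Seconds between the first occurrences of two (control, value) pairs."""
    times = {}
    for stamp, control, value in events:
        times.setdefault((control, value), stamp)
    return (times[second] - times[first]).total_seconds()


events = servo_settings(LOG)
gap = gap_seconds(
    events, ("ec_uart_cmd", "flashwp enable"), ("ec_uart_cmd", "reboot")
)
print(f"reboot issued {gap:.3f} s after flashwp enable")
```

On the lines above, the reboot command is issued roughly half a second after "flashwp enable".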

Tom Wai-Hong Tam (談偉航)

May 3, 2018, 10:40:13 AM

to Martin Yan, faft users

I checked the log. The test logic of firmware_SoftwareSync completed, and the test then tried to restore the original write-protection state, which was ON.

That means sending the EC command "flashwp enable" and then rebooting the EC. The recovery screen appeared after that. You may want to check whether anything is wrong in how the EC handles write protection.

You could also try a different original WP state, such as setting it OFF before the test, to narrow down the issue.


Martin Yan

May 3, 2018, 7:03:24 PM

to faft users, Marti...@microchip.com

I forced the WP state OFF via “dut-control fw_wp_state:force_off” before the test. It passed once and failed once.

Could leftover data in EC SRAM from a previous test cause the BIOS to jump to recovery_reason 0x54 or 0x2b? Or could a pending state in the TPM cause this?

I also captured the BIOS UART messages from when the recovery screen with recovery_reason 0x54 or 0x2b is shown. Any clues?

150-bios-cpu-uart-tpm-issue

151-bios-cpu-uart-tpm-issue-2

Tom Wai-Hong Tam (談偉航)

May 7, 2018, 11:54:16 AM

to Martin Yan, faft users

Looks like it was caused by a transaction error while the EC was talking to the TPM:

Calling VbSelectAndLoadKernel().

tpm_get_response: tpm transaction failed for 0x14e with error 0x10000

RollbackKernelRead: Rollback: 00010000 returned by ReadSpaceKernel(&rsk)

vb2_kernel_setup: Unable to get kernel versions from TPM

tpm_get_response: tpm transaction failed for 0x14e with error 0x10000

RollbackFwmpRead: TPM: read returned 0x10000

vb2_kernel_setup: Unable to get FWMP from TPM

Which board are you using? Does it have Cr50 as the TPM?

I guess the TPM was busy with some previous operation or didn't get reset properly. You could try adding a delay before the reboot to see if it helps.

Or you could try changing the reboot method from warm_reset (the default) to cold_reset to narrow it down.
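
The failure signature in the console output quoted in this post can be searched for mechanically in a captured UART log. A minimal sketch, assuming the message format is exactly as quoted here; the function name is illustrative and the meaning of the two hex values is not interpreted:

```python
import re

# Console lines copied from the excerpt in this post.
CONSOLE = """\
Calling VbSelectAndLoadKernel().
tpm_get_response: tpm transaction failed for 0x14e with error 0x10000
RollbackKernelRead: Rollback: 00010000 returned by ReadSpaceKernel(&rsk)
vb2_kernel_setup: Unable to get kernel versions from TPM
tpm_get_response: tpm transaction failed for 0x14e with error 0x10000
RollbackFwmpRead: TPM: read returned 0x10000
vb2_kernel_setup: Unable to get FWMP from TPM
"""

TPM_FAIL_RE = re.compile(
    r"tpm_get_response: tpm transaction failed for (0x[0-9a-fA-F]+) "
    r"with error (0x[0-9a-fA-F]+)"
)


def tpm_failures(console_text):
    """Return (code, error) as integers for every failed TPM transaction."""
    return [
        (int(m.group(1), 16), int(m.group(2), 16))
        for m in TPM_FAIL_RE.finditer(console_text)
    ]


failures = tpm_failures(CONSOLE)
for code, error in failures:
    print(f"transaction 0x{code:x} failed with error 0x{error:x}")
```

Running the same scan over a log from a successful boot should return an empty list, which makes it easy to compare passing and failing runs.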

Martin Yan

May 7, 2018, 12:15:27 PM

to faft users, Marti...@microchip.com

Thanks for your feedback.

My comments:

Looks like it was caused by a transaction error while the EC was talking to the TPM:

[Martin]: Actually, on this board design the EC doesn’t talk to the TPM.

Which board are you using? Does it have Cr50 as the TPM?

[Martin]: It is a reef board, and yes, Cr50 is the TPM. See the attached successful case “152-bios-cpu-uart-tpm-issue-bootup”; the CPU UART shows the message

cr50 TPM 2.0 (i2c 0x50 id 0x28)

I guess the TPM was busy with some previous operation or didn't get reset properly. You could try adding a delay before the reboot to see if it helps.

[Martin]: Should the delay be added in the EC firmware or in the FAFT script?

Or you could try changing the reboot method from warm_reset (the default) to cold_reset to narrow it down.

[Martin]: In the FAFT script?

Regards,

Martin

152-bios-cpu-uart-tpm-issue-bootup

Tom Wai-Hong Tam (談偉航)

May 7, 2018, 12:48:08 PM

to Martin Yan, faft users

On Mon, May 7, 2018 at 9:15 AM Martin Yan <Marti...@microchip.com> wrote:

Thanks for your feedback.

My comments:

Looks like it was caused by a transaction error while the EC was talking to the TPM:

[Martin]: Actually, on this board design the EC doesn’t talk to the TPM.

Right. It is the AP talking to the TPM.

Which board are you using? Does it have Cr50 as the TPM?

[Martin]: It is a reef board, and yes, Cr50 is the TPM. See the attached successful case “152-bios-cpu-uart-tpm-issue-bootup”; the CPU UART shows the message

cr50 TPM 2.0 (i2c 0x50 id 0x28)

Could you check the Cr50 UART? The servo control is cr50_uart_pty, similar to ec_uart_pty.

I guess the TPM was busy with some previous operation or didn't get reset properly. You could try adding a delay before the reboot to see if it helps.

[Martin]: Should the delay be added in the EC firmware or in the FAFT script?

Like this:

diff --git a/server/cros/faft/firmware_test.py b/server/cros/faft/firmware_test.py
index 9a96a05d5..c43821032 100644
--- a/server/cros/faft/firmware_test.py
+++ b/server/cros/faft/firmware_test.py
@@ -638,6 +638,7 @@ class FirmwareTest(FAFTBase):
         if enable:
             # Set write protect flag and reboot to take effect.
             self.ec.set_flash_write_protect(enable)
+            time.sleep(30)
             self.sync_and_ec_reboot()
         else:
             # Reboot after deasserting hardware write protect pin to deactivate

Or you could try changing the reboot method from warm_reset (the default) to cold_reset to narrow it down.

[Martin]: In the FAFT script?

Like this:

diff --git a/server/cros/faft/firmware_test.py b/server/cros/faft/firmware_test.py
index 9a96a05d5..761823f48 100644
--- a/server/cros/faft/firmware_test.py
+++ b/server/cros/faft/firmware_test.py
@@ -638,11 +638,11 @@ class FirmwareTest(FAFTBase):
         if enable:
             # Set write protect flag and reboot to take effect.
             self.ec.set_flash_write_protect(enable)
-            self.sync_and_ec_reboot()
+            self.switcher.simple_reboot(reboot_type='cold')
         else:
             # Reboot after deasserting hardware write protect pin to deactivate
             # write protect. And then remove software write protect flag.
-            self.sync_and_ec_reboot()
+            self.switcher.simple_reboot(reboot_type='cold')
             self.ec.set_flash_write_protect(enable)
 
     def _setup_ec_write_protect(self, ec_wp):

Martin Yan

May 10, 2018, 10:48:43 AM

to faft users, Marti...@microchip.com

After more tests on the system:

1: Could you check the Cr50 UART? The servo control is cr50_uart_pty, similar to ec_uart_pty.

Martin: It looks like the Cr50 UART is not wired to the servo board V2. Running “cr50_uart_pty” shows “problem with [‘cr50_uart_pty’] :: not control name ‘cr50_uart_pty’”.

2: You could try adding a delay before the reboot to see if it helps.

Martin: I added the delay:

 self.ec.set_flash_write_protect(enable)
+time.sleep(30)
 self.sync_and_ec_reboot()

Yes, this improves reliability; the firmware_SoftwareSync.dev case now passes. Thanks!

3: You could try changing the reboot method from warm_reset (the default) to cold_reset to narrow it down.

-self.sync_and_ec_reboot()
+self.switcher.simple_reboot(reboot_type='cold')

Martin: This change doesn’t improve reliability, and it has side effects that cause other test cases to fail.

4: One new observation, which I hit once during regression:

14:51:55 INFO | autoserv| Setting ec_uart_regexp to None

14:51:55 INFO | autoserv| Setting ec_uart_cmd to reboot

14:51:56 INFO | autoserv| Setting ec_uart_cmd to flashwp disable

14:51:57 INFO | autoserv| -[FAFT]-[ start wait_for_client ]---

// the test server times out waiting for a ping from the DUT

The system cannot boot to the OS because of the TPM issue (it is stuck at the recovery screen). Similarly, where can we add a delay to improve reliability? (I can't find "flashwp disable" in the same .py file.)

Regards,

Martin

Tom Wai-Hong Tam (談偉航)

May 14, 2018, 12:13:13 PM

to Martin Yan, faft users

On Thu, May 10, 2018 at 7:48 AM Martin Yan <Marti...@microchip.com> wrote:

The system cannot boot to the OS because of the TPM issue (it is stuck at the recovery screen). Similarly, where can we add a delay to improve reliability? (I can't find "flashwp disable" in the same .py file.)

The "flashwp" command is in another Python module: cros/servo/chrome_ec.py
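
One way to experiment with a delay around the write-protect change without editing chrome_ec.py is to wrap the call on the test side. A minimal sketch, assuming only the set_flash_write_protect(enable) method that appears in the diff earlier in this thread. The names set_wp_with_settle, settle_s, and FakeEC are hypothetical and not part of FAFT, and whether a delay at this point helps the "flashwp disable" case has not been tested here:

```python
import time


def set_wp_with_settle(ec, enable, settle_s=30, sleep=time.sleep):
    """Change the EC flash write-protect flag, then wait before returning.

    The pause mirrors the time.sleep(30) workaround reported in this
    thread; the right duration is an assumption to be tuned per board.
    """
    ec.set_flash_write_protect(enable)
    sleep(settle_s)


class FakeEC:
    """Stand-in for the real EC object so the sketch runs without hardware."""

    def __init__(self):
        self.calls = []

    def set_flash_write_protect(self, enable):
        self.calls.append(enable)


# Demo: record the requested wait instead of actually sleeping for 30 s.
waits = []
ec = FakeEC()
set_wp_with_settle(ec, False, settle_s=30, sleep=waits.append)
print(ec.calls, waits)
```

Passing the sleep function as a parameter keeps the delay easy to shorten or disable while comparing pass rates across runs.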

Martin Yan

May 22, 2018, 9:18:16 AM

to faft users, Marti...@microchip.com

Got it, thanks!
