31 Commits

Author SHA1 Message Date
Shengwei Luo
12b4bb0fd0 Fix cpu isolate errors when some cpus are offline before the service started
Signed-off-by: slim6882 <yangjunshuo@huawei.com>
2024-04-25 17:09:08 +08:00
Bing Xia
4cdf0a2c6b rasdaemon: Fix for vendor errors are not recorded in the SQLite database if some cpus are offline
Fix for vendor errors are not recorded in the SQLite database if some cpus
are offline at the system start.

Signed-off-by: Bing Xia <xiabing12@h-partners.com>
2024-04-23 15:20:14 +08:00
caixiaomeng
544fd1a7d7 add dynamic switch of ras events support and disable block_rq_complete 2024-04-08 17:30:57 +08:00
zhangruifang2020
a5a053ef71 backport upstream patches 2024-03-25 14:27:40 +08:00
caixiaomeng
639a2e6a2b fix rasdaemon disable service after upgrade 2023-12-28 16:37:23 +08:00
caixiaomeng
090cf6e9c1 backport upstream patches 2023-12-20 15:11:39 +08:00
renxichen
b1112cd6a0 bugfix on rasdaemon.service 2023-12-01 17:17:09 +08:00
xia-bing1
9ea0f76ce8 rasdaemon: ras-mc-ctl: Modify check for HiSilicon KunPeng9xx error fields 2023-11-02 20:47:05 +08:00
znzjugod
9673ae5fce ras-events: quit loop in read_ras_event when kbuf data is broken 2023-06-20 09:56:13 +08:00
Shiju Jose
e545cf1707 rasdaemon: Add fix patches for rasdaemon and Add support for creating the vendor error tables at startup
Add the following fix patches and changes,
1. Fix return value type issue of read/write function from unistd.h.
2. Fix issue of signed and unsigned integer comparison.
3. Remove redundant header file and do some clean-up.
4. Add support for create/open the vendor error tables at rasdaemon startup.
5. Make changes in the HiSilicon error handling code for the same.
6. Add four modules supported by HiSilicon common section.

Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
2023-06-02 13:09:02 +01:00
Lv Ying
2b78b3ecf0 rasdaemon/libtrace: fix libtrace wrong parse ring buffer traceevent when there is TIMESTAMP traceevent
Signed-off-by: Lv Ying <lvying6@huawei.com>
2023-05-30 23:35:47 +08:00
huangfangrun
a8c4df69fe rasdaemon: Fix for regression in ras_mc_create_table() if some cpus are offline at the system start and Fix poll() on per_cpu trace_pipe_raw blocks indefinitely 2023-04-04 15:17:47 +08:00
Lv Ying
cdba1b6d27 fix ras-mc-ctl.service startup failed when selinux is on 2023-03-29 15:53:27 -06:00
renxichen
c80dbc7963 backport patches from upstream 2023-03-23 16:03:12 +08:00
Lv Ying
40f0a70c6e rasdaemon/diskerror: fix incomplete diskerror log
Signed-off-by: Lv Ying <lvying6@huawei.com>
2023-02-16 15:52:21 +08:00
fenglei
a39ae66460 rasdaemon: Fix startup core dumped issue.
Add the following patch to fix startup core dumped issue.
    0001-rasdaemon-use-standard-length-PATH_MAX-for-path-name.patch

    Signed-off-by: fenglei <fenglei47@h-partners.com>
2022-10-28 11:49:29 +08:00
Shiju Jose
0fb80a07eb rasdaemon: Update with the latest patches for the CPU fault isolation, Hisilicon Kunpeng9xx common error records and improvements in the ras-mc-ctl for the Hisilicon Kunpeng9xx errors
Update with the latest patches for the
1. CPU online fault isolation for arm event.
2. Modify recording Hisilicon common error data in the rasdaemon
3. In the ras-mc-ctl,
3.1. Improve Hisilicon common error statistics.
3.2. Add support to display the HiSilicon vendor-errors for a specified module.
3.3. Add printing usage if necessary parameters are not passed for the HiSilicon vendor-errors options.
3.4. Reformat error info of the HiSilicon Kunpeng920.
3.5. Relocate reading and display Kunpeng920 errors to under Kunpeng9xx.
3.6. Updated the HiSilicon platform name as KunPeng9xx.
4. Fixed a memory out-of-bounds issue in the rasdaemon.

Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
2022-05-30 09:52:55 +01:00
Shiju Jose
d12d3023a9 rasdaemon: Modify the format of the Hisilicon Kunpeng9xx common error records and improvements in the ras-mc-ctl for the Hisilicon Kunpeng9xx errors
1. Modify the recording format of the Hisilicon Kunpeng9xx common errors in the rasdaemon.
2. In the ras-mc-ctl,
2.1. Modify the error statistics for the HiSilicon Kunpeng9xx common errors to display
     the statistics and error info based on the module and the error severity.
2.2. Add support to display the vendor-errors for a specified module.
2.3. Add printing usage if the necessary parameters are not passed for the
     vendor-errors options.
2.4. Reformat error info of the HiSilicon Kunpeng920.
2.5. Relocate reading and display Kunpeng920 errors to under Kunpeng9xx.

Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
(cherry picked from commit ca01a3db7b2b002855070d02a095296680325354)
2022-03-28 16:09:21 +08:00
Xiaofei Tan
cefb2d1861 Fix some issues:
1. Backport 4 patches from openEuler master branch.
2. Enable compilation of the feature memory fault prediction based on corrected error.
3. Fix changelog date error of this spec file.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
(cherry picked from commit beb85dd5cfd3920dee725abb2e07fffd13f3dc1c)
2022-03-28 15:45:50 +08:00
Lostwayzxc
d46cbe4559 add cpu online fault isolation for arm event
(cherry picked from commit ac231c1c3131299c48d46780887ffb469d677de5)
2022-03-28 15:07:01 +08:00
xujing
6e9b370dfd rasdaemon: update software to v0.6.7 2021-12-08 11:20:47 +08:00
Xiaofei Tan
defae9844b rasdaemon: backport hisilicon common section patches from community 2021-07-30 14:29:14 +08:00
xujing
959a34bc6f rasdaemon: fix disk error log storm 2021-05-15 18:24:31 +08:00
lvying
0b9d06a79e rasdaemon: backport bugfix patch from community
Fix error print handle_ras_events:
00115dda85

Signed-off-by: lvying <lvying6@huawei.com>
2021-04-28 12:06:47 +08:00
Lv Ying
d708bdf82b rasdaemon: backport bugfix patches from community
1. ras-page-isolation: do_page_offline always considers page offline was successful
e4d27840e1
2. ras-page-isolation: page which is PAGE_OFFLINE_FAILED can be offlined again
c329012ce4
2021-03-31 11:28:01 -07:00
lvying
a78599da8d rasdaemon:update Source0
Signed-off-by: lvying <lvying6@huawei.com>
2020-09-25 01:15:25 -07:00
chengquan
600f22fc95 update software to v0.6.6 2020-07-24 16:47:49 +08:00
chengquan
7e5fdbf6c8 fix file descriptor leak in ras-report.c:setup_report_socket() 2020-03-18 15:44:27 +08:00
chengquan
7970b44d84 fix file descriptor leak in ras-report.c:setup_report_socket() 2020-03-11 16:38:47 +08:00
zhuchunyi
ae37fc0910 update code 2019-11-06 19:51:08 +08:00
overweight
4f1fc1e5a7 Package init 2019-09-30 11:16:11 -04:00