Age | Commit message (Collapse) | Author |
|
The PMU is already counting cycles by calculating time elapsed in
nanoseconds. Counting instructions is a different matter and requires
another approach.
This patch adds the capability of counting completed instructions (Perf
event PM_INST_CMPL) by counting the amount of instructions translated in
each translation block right before exiting it.
A new pmu_count_insns() helper in translation.c was added to do that.
After verifying that the PMU is counting instructions, call
helper_insns_inc(). This new helper from power8-pmu.c will add the
instructions to the relevant counters. It'll also be responsible for
triggering counter negative overflows as it is already being done with
cycles.
To verify whether the PMU is counting instructions or now, a new hflags
named 'HFLAGS_INSN_CNT' is introduced. This flag will match the internal
state of the PMU. We're be using this flag to avoid calling
helper_insn_inc() when we do not have a valid instruction event being
sampled.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20211201151734.654994-7-danielhb413@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
|
|
MMCR1 determines the events to be sampled by the PMU. Updating the
counters at every MMCR1 write ensures that we're not sampling more
or less events by looking only at MMCR0 and the PMCs.
It is worth noticing that both the Book3S PowerPC PMU, and this IBM
Power8+ PMU that we're modeling, also uses MMCRA, MMCR2 and MMCR3 to
control the PMU. These three registers aren't being handled in this
initial implementation, so for now we're controlling all the PMU
aspects using MMCR0, MMCR1 and the PMCs.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20211201151734.654994-5-danielhb413@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
|
|
Calling pmu_update_cycles() on every PMC read/write operation ensures
that the values being fetched are up to date with the current PMU state.
In theory we can get away by just trapping PMCs reads, but we're going
to trap PMC writes to deal with counter overflow logic later on. Let's
put the required wiring for that and make our lives a bit easier in the
next patches.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20211201151734.654994-4-danielhb413@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
|
|
This patch adds the barebones of the PMU logic by enabling cycle
counting. The overall logic goes as follows:
- MMCR0 reg initial value is set to 0x80000000 (MMCR0_FC set) to avoid
having to spin the PMU right at system init;
- to retrieve the events that are being profiled, pmc_get_event() will
check the current MMCR0 and MMCR1 value and return the appropriate
PMUEventType. For PMCs 1-4, event 0x2 is the implementation dependent
value of PMU_EVENT_INSTRUCTIONS and event 0x1E is the implementation
dependent value of PMU_EVENT_CYCLES. These events are supported by IBM
Power chips since Power8, at least, and the Linux Perf driver makes use
of these events until kernel v5.15. For PMC1, event 0xF0 is the
architected PowerISA event for cycles. Event 0xFE is the architected
PowerISA event for instructions;
- if the counter is frozen, either via the global MMCR0_FC bit or its
individual frozen counter bits, PMU_EVENT_INACTIVE is returned;
- pmu_update_cycles() will go through each counter and update the
values of all PMCs that are counting cycles. This function will be
called every time a MMCR0 update is done to keep counters values
up to date. Upcoming patches will use this function to allow the
counters to be properly updated during read/write of the PMCs
and MMCR1 writes.
Given that the base CPU frequency is fixed at 1Ghz for both powernv and
pseries clock, cycle calculation assumes that 1 nanosecond equals 1 CPU
cycle. Cycle value is then calculated by adding the elapsed time, in
nanoseconds, of the last cycle update done via pmu_update_cycles().
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20211201151734.654994-3-danielhb413@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
|
|
Problem state needs to be able to read and write the PMU counters,
otherwise it won't be aware of any sampling result that the PMU produces
after a Perf run.
This patch does that in a similar fashion as already done in the
previous patches. PMCs 5 and 6 have a special condition, aside from the
constraints that are common with PMCs 1-4, where they are not part of the
PMU if MMCR0_PMCC is 0b11.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20211018010133.315842-5-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|
|
Similar to the previous patch, let's add problem state read/write access to
the MMCR2 SPR, which is also a group A PMU SPR that needs to be filtered
to be read/written by userspace.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20211018010133.315842-4-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|
|
Userspace need access to PMU SPRs to be able to operate the PMU. One of
such SPRs is MMCR0.
MMCR0, as defined by PowerISA v3.1, is classified as a 'group A' PMU
register. This class of registers has common read/write rules that are
governed by MMCR0 PMCC bits. MMCR0 is also not fully exposed to problem
state: only MMCR0_FC, MMCR0_PMAO and MMCR0_PMAE bits are
readable/writable in this case.
This patch exposes MMCR0 to userspace by doing the following:
- two new callbacks, spr_read_MMCR0_ureg() and spr_write_MMCR0_ureg(),
are added to be used as problem state read/write callbacks of UMMCR0.
Both callbacks filters the amount of bits userspace is able to
read/write by using a MMCR0_UREG_MASK;
- problem state access control is done by the spr_groupA_read_allowed()
and spr_groupA_write_allowed() helpers. These helpers will read the
current PMCC bits from DisasContext and check whether the read/write
MMCR0 operation is valid or noti;
- to avoid putting exclusive PMU logic into the already loaded
translate.c file, let's create a new 'power8-pmu-regs.c.inc' file that
will hold all the spr_read/spr_write functions of PMU registers.
The 'power8' name of this new file intends to hint about the proven
support of the PMU logic to be added. The code has been tested with the
IBM POWER chip family, POWER8 being the oldest version tested. This
doesn't mean that the PMU logic will break with any other PPC64 chip
that implements Book3s, but rather that we can't assert that it works
properly with any Book3s compliant chip.
CC: Gustavo Romero <gustavo.romero@linaro.org>
Signed-off-by: Gustavo Romero <gromero@linux.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20211018010133.315842-3-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|